Handlers Module¶
gigaspatial.handlers ¶
base ¶
BaseHandler ¶
Bases: ABC
Abstract base class that orchestrates configuration, downloading, and reading functionality.
This class serves as the main entry point for dataset handlers, providing a unified interface for data acquisition and loading. It manages the lifecycle of config, downloader, and reader components.
Subclasses should implement the abstract methods to provide specific handler types and define how components are created and interact.
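The factory-method arrangement described above can be illustrated with a minimal, self-contained sketch. It does not use the gigaspatial classes: `ToyHandlerBase`, `ToyHandler`, and `ToyConfig` are hypothetical stand-ins, and the real `create_*` methods also receive `data_store` and `logger`, which are omitted here for brevity.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a BaseHandlerConfig subclass."""

    base_path: str = "data/toy"


class ToyHandlerBase(ABC):
    """Simplified stand-in for BaseHandler: builds any component not passed in."""

    def __init__(self, config=None, downloader=None, reader=None):
        # Each component falls back to its factory method when omitted.
        self.config = config if config is not None else self.create_config()
        self.downloader = (
            downloader if downloader is not None else self.create_downloader(self.config)
        )
        self.reader = reader if reader is not None else self.create_reader(self.config)

    @abstractmethod
    def create_config(self): ...

    @abstractmethod
    def create_downloader(self, config): ...

    @abstractmethod
    def create_reader(self, config): ...


class ToyHandler(ToyHandlerBase):
    def create_config(self):
        return ToyConfig()

    def create_downloader(self, config):
        # A real downloader would fetch files; this one only reports a target path.
        return lambda unit: f"{config.base_path}/{unit}.zip"

    def create_reader(self, config):
        # A real reader would parse files; this one echoes the paths it was given.
        return lambda paths: list(paths)


handler = ToyHandler()
print(handler.downloader("tile_1"))  # data/toy/tile_1.zip
```

Passing an explicit component (for example `ToyHandler(config=ToyConfig(base_path="x"))`) bypasses the corresponding factory method, which mirrors the "If None, will be created via ..." wording in the `__init__` parameter table.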
Source code in gigaspatial/handlers/base.py
config: BaseHandlerConfig property ¶
Get the configuration object.
downloader: BaseHandlerDownloader property ¶
Get the downloader object.
reader: BaseHandlerReader property ¶
Get the reader object.
__enter__() ¶
__exit__(exc_type, exc_val, exc_tb) ¶
__init__(config=None, downloader=None, reader=None, data_store=None, logger=None) ¶
Initialize the BaseHandler with optional components.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Optional[BaseHandlerConfig] | Configuration object. If None, will be created via create_config() | None |
downloader | Optional[BaseHandlerDownloader] | Downloader instance. If None, will be created via create_downloader() | None |
reader | Optional[BaseHandlerReader] | Reader instance. If None, will be created via create_reader() | None |
data_store | Optional[DataStore] | Data store instance. Defaults to LocalDataStore if not provided | None |
logger | Optional[Logger] | Logger instance. If not provided, creates one based on class name | None |
Source code in gigaspatial/handlers/base.py
__repr__() ¶
String representation of the handler.
Source code in gigaspatial/handlers/base.py
cleanup() ¶
Cleanup resources used by the handler.
Override in subclasses if specific cleanup is needed.
create_config(data_store, logger, **kwargs) abstractmethod ¶
Create and return a configuration object for this handler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional configuration parameters | {} |
Returns:
| Type | Description |
|---|---|
BaseHandlerConfig | Configured BaseHandlerConfig instance |
Source code in gigaspatial/handlers/base.py
create_downloader(config, data_store, logger, **kwargs) abstractmethod ¶
Create and return a downloader object for this handler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | BaseHandlerConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional downloader parameters | {} |
Returns:
| Type | Description |
|---|---|
BaseHandlerDownloader | Configured BaseHandlerDownloader instance |
Source code in gigaspatial/handlers/base.py
create_reader(config, data_store, logger, **kwargs) abstractmethod ¶
Create and return a reader object for this handler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | BaseHandlerConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional reader parameters | {} |
Returns:
| Type | Description |
|---|---|
BaseHandlerReader | Configured BaseHandlerReader instance |
Source code in gigaspatial/handlers/base.py
download_and_load(source, crop_to_source=False, force_download=False, **kwargs) ¶
Convenience method to download (if needed) and load data in one call.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
force_download | bool | If True, download even if data exists locally | False |
**kwargs | | Additional parameters | {} |
Returns:
| Type | Description |
|---|---|
Any | Loaded data |
Source code in gigaspatial/handlers/base.py
ensure_data_available(source, force_download=False, **kwargs) ¶
Ensure that data is available for the given source.
This method checks if the required data exists locally, and if not (or if force_download is True), downloads it using the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
force_download | bool | If True, download even if data exists locally | False |
**kwargs | | Additional parameters passed to download methods | {} |
Returns:
| Name | Type | Description |
|---|---|---|
bool | bool | True if data is available after this operation |
Source code in gigaspatial/handlers/base.py
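The check-then-download behaviour described for `ensure_data_available` can be sketched as follows. This is an illustration of the logic only, not the library's implementation: `ensure_available` and `fake_download` are hypothetical names, and the sketch checks the local filesystem directly, whereas the real method presumably goes through the configured data store.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def ensure_available(
    paths: Iterable[Path],
    download: Callable[[List[Path]], None],
    force_download: bool = False,
) -> bool:
    """Download missing files (or all files when forced), then re-check."""
    paths = list(paths)
    missing = [p for p in paths if not p.exists()]
    to_fetch = paths if force_download else missing
    if to_fetch:
        download(to_fetch)
    # Report whether everything is present after the attempt.
    return all(p.exists() for p in paths)


def fake_download(items: List[Path]) -> None:
    # Stand-in for a real downloader: create empty files.
    for item in items:
        item.write_bytes(b"")


with tempfile.TemporaryDirectory() as tmp:
    tile = Path(tmp) / "tile.tif"
    print(ensure_available([tile], fake_download))  # True
```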
get_available_data_info(source, **kwargs) ¶
Get information about available data for the given source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame] | The data source specification | required |
**kwargs | | Additional parameters | {} |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | Information about data availability, paths, etc. |
Source code in gigaspatial/handlers/base.py
load_data(source, crop_to_source=False, ensure_available=True, **kwargs) ¶
Load data from the given source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
Any | Loaded data (type depends on specific handler implementation) |
Source code in gigaspatial/handlers/base.py
BaseHandlerConfig dataclass ¶
Bases: ABC
Abstract base class for handler configuration objects. Provides standard fields for path, parallelism, data store, and logger. Extend this class for dataset-specific configuration.
Source code in gigaspatial/handlers/base.py
clear_unit_cache() ¶
extract_search_geometry(source, **kwargs) ¶
General method to extract a canonical geometry from supported source types.
Source code in gigaspatial/handlers/base.py
get_data_unit_path(unit, **kwargs) abstractmethod ¶
get_data_unit_paths(units, **kwargs) ¶
Given data unit identifiers, return the corresponding file paths.
Source code in gigaspatial/handlers/base.py
get_relevant_data_units_by_geometry(geometry, **kwargs) abstractmethod ¶
Given a geometry, return a list of relevant data unit identifiers (e.g., tiles, files, resources).
Source code in gigaspatial/handlers/base.py
BaseHandlerDownloader ¶
Bases: ABC
Abstract base class for handler downloader classes. Standardizes config, data_store, and logger initialization. Extend this class for dataset-specific downloaders.
Source code in gigaspatial/handlers/base.py
BaseHandlerReader ¶
Bases: ABC
Abstract base class for handler reader classes. Provides common methods for resolving source paths and loading data. Supports resolving by country, points, geometry, GeoDataFrame, or explicit paths. Includes generic loader functions for raster and tabular data.
Source code in gigaspatial/handlers/base.py
load(source, crop_to_source=False, **kwargs) ¶
Load data from the given source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame, Path, str, List[Union[str, Path]]] | The data source (country code/name, points, geometry, paths, etc.). | required |
crop_to_source | bool | If True, crop loaded data to the exact source geometry | False |
**kwargs | | Additional parameters to pass to the loading process. | {} |
Returns:
| Type | Description |
|---|---|
Any | The loaded data. The type depends on the subclass implementation. |
Source code in gigaspatial/handlers/base.py
load_from_paths(source_data_path, **kwargs) abstractmethod ¶
Abstract method to load source data from paths.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source_data_path | List[Union[str, Path]] | List of source paths | required |
**kwargs | | Additional parameters for data loading | {} |
Returns:
| Type | Description |
|---|---|
Any | Loaded data (DataFrame, GeoDataFrame, etc.) |
Source code in gigaspatial/handlers/base.py
resolve_by_paths(paths, **kwargs) ¶
Return explicit paths as a list.
Source code in gigaspatial/handlers/base.py
resolve_source_paths(source, **kwargs) ¶
Resolve source data paths based on the type of source input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame, Path, str, List[Union[str, Path]]] | Can be a country code or name (str), list of points, geometry, GeoDataFrame, or explicit path(s) | required |
**kwargs | | Additional parameters for path resolution | {} |
Returns:
| Type | Description |
|---|---|
List[Union[str, Path]] | List of resolved source paths |
Source code in gigaspatial/handlers/base.py
boundaries ¶
AdminBoundaries ¶
Bases: BaseModel
Base class for administrative boundary data with flexible fields.
Source code in gigaspatial/handlers/boundaries.py
create(country_code=None, admin_level=0, data_store=None, path=None, **kwargs) classmethod ¶
Factory method to create an AdminBoundaries instance using various data sources, depending on the provided parameters and global configuration.
Loading Logic
- If a data_store is provided and either a path is given or global_config.ADMIN_BOUNDARIES_DATA_DIR is set:
  - If path is not provided but country_code is, the path is constructed using global_config.get_admin_path().
  - Loads boundaries from the specified data store and path.
- If only country_code is provided (no data_store):
  - Attempts to load boundaries from GeoRepo (if available).
  - If GeoRepo is unavailable, attempts to load from GADM.
  - If GADM fails, falls back to geoBoundaries.
  - Raises an error if all sources fail.
- If neither country_code nor data_store is provided:
  - Raises a ValueError.
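The online fallback order (GeoRepo, then GADM, then geoBoundaries) follows a common "try each source in turn" pattern. The sketch below shows that pattern in isolation, under stated assumptions: `load_with_fallback` and the three loader functions are hypothetical placeholders, not the `from_georepo` / `from_gadm` class methods, and the real code may catch narrower exception types.

```python
from typing import Callable, Sequence, Tuple


def load_with_fallback(loaders: Sequence[Tuple[str, Callable[[], object]]]):
    """Try each named loader in order; return the first successful result."""
    errors = []
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:  # a real implementation may catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All sources failed: " + "; ".join(errors))


def georepo():
    raise ConnectionError("GeoRepo unavailable")  # simulated failure


def gadm():
    return ["boundary-from-gadm"]  # placeholder result


def geoboundaries():
    return ["boundary-from-geoboundaries"]  # placeholder result


source, result = load_with_fallback(
    [("GeoRepo", georepo), ("GADM", gadm), ("geoBoundaries", geoboundaries)]
)
print(source)  # GADM
```

Collecting the per-source error messages before raising keeps the final `RuntimeError` informative when every source fails.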
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
country_code | Optional[str] | ISO country code (2 or 3 letter) or country name. | None |
admin_level | int | Administrative level (0=country, 1=state/province, etc.). | 0 |
data_store | Optional[DataStore] | Optional data store instance for loading from existing data. | None |
path | Optional[Union[str, Path]] | Optional path to data file (used with data_store). | None |
**kwargs | | Additional arguments passed to the underlying creation methods. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
AdminBoundaries | AdminBoundaries | Configured instance. |
Raises:
| Type | Description |
|---|---|
ValueError | If neither country_code nor (data_store, path) are provided, or if country_code lookup fails. |
RuntimeError | If all data sources fail to load boundaries. |
Examples:
# Load from a data store (path auto-generated if not provided)
boundaries = AdminBoundaries.create(country_code="USA", admin_level=1, data_store=store)
# Load from a specific file in a data store
boundaries = AdminBoundaries.create(data_store=store, path="data.shp")
# Load from online sources (GeoRepo, GADM, geoBoundaries)
boundaries = AdminBoundaries.create(country_code="USA", admin_level=1)
Source code in gigaspatial/handlers/boundaries.py
from_data_store(data_store, path, admin_level=0, **kwargs) classmethod ¶
Load and create instance from internal data store.
Source code in gigaspatial/handlers/boundaries.py
from_gadm(country_code, admin_level=0, **kwargs) classmethod ¶
Load and create instance from GADM data.
Source code in gigaspatial/handlers/boundaries.py
from_georepo(country_code=None, admin_level=0, **kwargs) classmethod ¶
Load and create instance from GeoRepo (UNICEF) API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
country | | Country name (if using name-based lookup) | required |
iso3 | | ISO3 code (if using code-based lookup) | required |
admin_level | int | Administrative level (0=country, 1=state, etc.) | 0 |
api_key | | GeoRepo API key (optional) | required |
email | | GeoRepo user email (optional) | required |
kwargs | | Extra arguments (ignored) | {} |
Returns:
| Type | Description |
|---|---|
AdminBoundaries | AdminBoundaries instance |
Source code in gigaspatial/handlers/boundaries.py
from_global_country_boundaries(scale='medium') classmethod ¶
Load global country boundaries from Natural Earth Data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
scale | str | One of 'large' (10m), 'medium' (50m), or 'small' (110m). | 'medium' |
Source code in gigaspatial/handlers/boundaries.py
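The `scale` argument maps onto the Natural Earth resolution labels listed in the table. A minimal sketch of such a lookup is shown below; `SCALE_TO_RESOLUTION` and `resolution_for` are hypothetical names, and raising `ValueError` for an unknown scale is an assumption rather than documented behaviour.

```python
# Mapping taken from the parameter description: scale name -> Natural Earth resolution.
SCALE_TO_RESOLUTION = {"large": "10m", "medium": "50m", "small": "110m"}


def resolution_for(scale: str = "medium") -> str:
    """Translate a scale name into its resolution label."""
    try:
        return SCALE_TO_RESOLUTION[scale]
    except KeyError:
        # Assumption: unknown scales are rejected explicitly.
        valid = ", ".join(SCALE_TO_RESOLUTION)
        raise ValueError(f"scale must be one of: {valid}") from None


print(resolution_for())         # 50m
print(resolution_for("large"))  # 10m
```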
get_schema_config() classmethod ¶
to_geodataframe() ¶
Convert the AdminBoundaries to a GeoDataFrame.
Source code in gigaspatial/handlers/boundaries.py
AdminBoundary ¶
Bases: BaseModel
Base class for administrative boundary data with flexible fields.
Source code in gigaspatial/handlers/boundaries.py
ghsl ¶
CoordSystem ¶
GHSLDataConfig dataclass ¶
Bases: BaseHandlerConfig
Source code in gigaspatial/handlers/ghsl.py
__repr__() ¶
Return a string representation of the GHSL dataset configuration.
Source code in gigaspatial/handlers/ghsl.py
compute_dataset_url(tile_id=None) ¶
Compute the download URL for a GHSL dataset.
Source code in gigaspatial/handlers/ghsl.py
get_data_unit_path(unit=None, file_ext='.zip', **kwargs) ¶
Construct and return the path for the configured dataset or dataset tile.
Source code in gigaspatial/handlers/ghsl.py
get_relevant_data_units_by_geometry(geometry, **kwargs) ¶
Return intersecting tiles for a given geometry or GeoDataFrame.
Source code in gigaspatial/handlers/ghsl.py
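Selecting "intersecting tiles" amounts to a spatial overlap test between the search geometry and each tile footprint. The sketch below reduces this to axis-aligned bounding boxes using only the standard library. It is a simplification under stated assumptions: the tile IDs and extents are invented for illustration, and the real method presumably tests the actual geometries against the GHSL tile grid rather than bounding boxes.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def boxes_intersect(a: BBox, b: BBox) -> bool:
    """True when two boxes overlap or touch on both axes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def relevant_tiles(search: BBox, tiles: Dict[str, BBox]) -> List[str]:
    """Return the IDs of all tiles whose box intersects the search box."""
    return [tile_id for tile_id, box in tiles.items() if boxes_intersect(search, box)]


# Hypothetical three-tile grid laid out left to right.
tiles = {"T1": (0, 0, 10, 10), "T2": (10, 0, 20, 10), "T3": (20, 0, 30, 10)}
print(relevant_tiles((5, 2, 12, 4), tiles))  # ['T1', 'T2']
```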
validate_configuration() ¶
Validate that the configuration is valid based on dataset availability constraints.
Specific rules:¶
Source code in gigaspatial/handlers/ghsl.py
GHSLDataDownloader ¶
Bases: BaseHandlerDownloader
A class to handle downloads of GHSL datasets.
Source code in gigaspatial/handlers/ghsl.py
__init__(config, data_store=None, logger=None) ¶
Initialize the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Union[GHSLDataConfig, dict[str, Union[str, int]]] | Configuration for the GHSL dataset, either as a GHSLDataConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/ghsl.py
download(source, extract=True, file_pattern='.*\\.tif$', **kwargs) ¶
Download GHSL data for a specified geographic region.
The region can be defined by a country code/name, a list of points, a Shapely geometry, or a GeoDataFrame. This method identifies the GHSL tiles that intersect the region and downloads the data for those tiles in parallel.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[Tuple[float, float], Point]], BaseGeometry, GeoDataFrame] | Defines the geographic area for which to download data. Can be a string representing a country code or name; a list of (latitude, longitude) tuples or Shapely Point objects; a Shapely BaseGeometry object (e.g., Polygon, MultiPolygon); or a GeoDataFrame with geometry column in EPSG:4326. | required |
extract | bool | If True and the downloaded files are zips, extract their contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional keyword arguments. These will be passed down to | {} |
Returns:
| Type | Description |
|---|---|
List[Optional[Union[Path, List[Path]]]] | A list of local file paths for the successfully downloaded tiles. Returns an empty list if no data is found for the region or if all downloads fail. |
Source code in gigaspatial/handlers/ghsl.py
download_by_country(country_code, data_store=None, country_geom_path=None, extract=True, file_pattern='.*\\.tif$', **kwargs) ¶
Download GHSL data for a specific country.
This is a convenience method to download data for an entire country using its code or name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
country_code | str | The country code (e.g., 'USA', 'GBR') or name. | required |
data_store | Optional[DataStore] | Optional instance of a | None |
country_geom_path | Optional[Union[str, Path]] | Optional path to a GeoJSON file containing the country boundary. If provided, this boundary is used instead of the default from | None |
extract | bool | If True and the downloaded files are zips, extract their contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional keyword arguments that are passed to | {} |
Returns:
| Type | Description |
|---|---|
List[Optional[Union[Path, List[Path]]]] | A list of local file paths for the successfully downloaded tiles for the specified country. |
Source code in gigaspatial/handlers/ghsl.py
download_data_unit(tile_id, extract=True, file_pattern='.*\\.tif$', **kwargs) ¶
Downloads and optionally extracts files for a given tile.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
tile_id | str | tile ID to process. | required |
extract | bool | If True and the downloaded file is a zip, extract its contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional parameters passed to download methods | {} |
Returns:
| Type | Description |
|---|---|
Optional[Union[Path, List[Path]]] | Path to the downloaded file if extract=False, list of paths to the extracted files if extract=True, or None on failure. |
Source code in gigaspatial/handlers/ghsl.py
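The default `file_pattern` of `'.*\\.tif$'` keeps only members whose names end in `.tif`. The sketch below shows how such a pattern filters a list of archive member names. The names are invented, `filter_members` is a hypothetical helper, and both the use of `re.match` and the "keep everything when the pattern is None" branch are assumptions about how the pattern is applied.

```python
import re

DEFAULT_PATTERN = r".*\.tif$"  # the documented default for file_pattern


def filter_members(names, file_pattern=DEFAULT_PATTERN):
    """Keep names matching the pattern; keep all names if the pattern is None."""
    if file_pattern is None:
        return list(names)
    regex = re.compile(file_pattern)
    return [name for name in names if regex.match(name)]


# Hypothetical contents of a downloaded tile archive.
members = ["GHS_POP_tile.tif", "GHS_POP_tile.tif.ovr", "README.pdf"]
print(filter_members(members))  # ['GHS_POP_tile.tif']
```

Because the pattern is anchored with `$`, sidecar files such as `.tif.ovr` are excluded even though their names contain `.tif`.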
download_data_units(tile_ids, extract=True, file_pattern='.*\\.tif$', **kwargs) ¶
Downloads multiple tiles in parallel, with an option to extract them.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
tile_ids | List[str] | A list of tile IDs to download. | required |
extract | bool | If True and the downloaded files are zips, extract their contents. Defaults to True. | True |
file_pattern | Optional[str] | Optional regex pattern to filter extracted files (if extract=True). | '.*\\.tif$' |
**kwargs | | Additional parameters passed to download methods | {} |
Returns:
| Type | Description |
|---|---|
List[Optional[Union[Path, List[Path]]]] | A list where each element corresponds to a tile ID and contains the path to the downloaded file if extract=False, a list of paths to the extracted files if extract=True, or None on failure. |
Source code in gigaspatial/handlers/ghsl.py
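A parallel download that returns one result per tile ID, in input order, can be sketched with a thread pool. This is an illustration under stated assumptions, not the library's code: `download_units` and `fake_download` are hypothetical, the tile IDs and paths are invented, and the actual implementation may use a different concurrency mechanism.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional


def download_units(
    tile_ids: List[str],
    download_one: Callable[[str], Optional[str]],
    max_workers: int = 4,
) -> List[Optional[str]]:
    """Run download_one for every tile concurrently; results keep input order."""

    def safe(tile_id: str) -> Optional[str]:
        try:
            return download_one(tile_id)
        except Exception:
            return None  # a failed tile becomes None instead of aborting the batch

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(safe, tile_ids))


def fake_download(tile_id: str) -> str:
    # Stand-in for a real per-tile download.
    if tile_id == "bad":
        raise IOError("simulated network error")
    return f"/tmp/{tile_id}.zip"


print(download_units(["R1_C1", "bad", "R1_C2"], fake_download))
# ['/tmp/R1_C1.zip', None, '/tmp/R1_C2.zip']
```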
GHSLDataHandler ¶
Bases: BaseHandler
Handler for GHSL (Global Human Settlement Layer) dataset.
This class provides a unified interface for downloading and loading GHSL data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/ghsl.py
__init__(product, year=2020, resolution=100, config=None, downloader=None, reader=None, data_store=None, logger=None, **kwargs) ¶
Initialize the GHSLDataHandler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
product | Literal['GHS_BUILT_S', 'GHS_BUILT_H_AGBH', 'GHS_BUILT_H_ANBH', 'GHS_BUILT_V', 'GHS_POP', 'GHS_SMOD'] | The GHSL product to use. Must be one of: GHS_BUILT_S (built-up surface); GHS_BUILT_H_AGBH (average gross building height); GHS_BUILT_H_ANBH (average net building height); GHS_BUILT_V (building volume); GHS_POP (population); GHS_SMOD (settlement model). | required |
year | int | The year of the data (default: 2020) | 2020 |
resolution | int | The resolution in meters (default: 100) | 100 |
config | Optional[GHSLDataConfig] | Optional configuration object | None |
downloader | Optional[GHSLDataDownloader] | Optional downloader instance | None |
reader | Optional[GHSLDataReader] | Optional reader instance | None |
data_store | Optional[DataStore] | Optional data store instance | None |
logger | Optional[Logger] | Optional logger instance | None |
**kwargs | | Additional configuration parameters | {} |
Source code in gigaspatial/handlers/ghsl.py
create_config(data_store, logger, **kwargs) ¶
Create and return a GHSLDataConfig instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional configuration parameters | {} |
Returns:
| Type | Description |
|---|---|
GHSLDataConfig | Configured GHSLDataConfig instance |
Source code in gigaspatial/handlers/ghsl.py
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a GHSLDataDownloader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | GHSLDataConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional downloader parameters | {} |
Returns:
| Type | Description |
|---|---|
GHSLDataDownloader | Configured GHSLDataDownloader instance |
Source code in gigaspatial/handlers/ghsl.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a GHSLDataReader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | GHSLDataConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional reader parameters | {} |
Returns:
| Type | Description |
|---|---|
GHSLDataReader | Configured GHSLDataReader instance |
Source code in gigaspatial/handlers/ghsl.py
load_into_dataframe(source, crop_to_source=False, ensure_available=True, **kwargs) ¶
Load GHSL data into a pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame containing the GHSL data |
Source code in gigaspatial/handlers/ghsl.py
load_into_geodataframe(source, crop_to_source=False, ensure_available=True, **kwargs) ¶
Load GHSL data into a geopandas GeoDataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
GeoDataFrame | GeoDataFrame containing the GHSL data |
Source code in gigaspatial/handlers/ghsl.py
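The `ensure_available` flag described above implies a "download what is missing, then load" flow. The following is a minimal sketch of that idea, not the handler's actual implementation: `ensure_then_load`, `fake_download`, and `fake_load` are hypothetical stand-ins, and the real handler may resolve and check data units differently.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def ensure_then_load(
    paths: Iterable[Path],
    download: Callable[[Path], None],    # hypothetical downloader callback
    load: Callable[[List[Path]], list],  # hypothetical reader callback
    ensure_available: bool = True,
) -> list:
    """Download any missing files first (if requested), then load everything."""
    resolved = [Path(p) for p in paths]
    if ensure_available:
        for path in resolved:
            if not path.exists():
                download(path)
    return load(resolved)


# Demo with stand-ins: "downloading" writes a small file, "loading" reads it back.
downloaded: List[str] = []


def fake_download(path: Path) -> None:
    path.write_text("tile-data")
    downloaded.append(path.name)


def fake_load(paths: List[Path]) -> list:
    # Only files that exist can be read; missing ones are skipped.
    return [p.read_text() for p in paths if p.exists()]


with tempfile.TemporaryDirectory() as tmp:
    cached = Path(tmp) / "cached.tif"
    cached.write_text("tile-data")
    missing = Path(tmp) / "missing.tif"
    loaded = ensure_then_load([cached, missing], fake_download, fake_load)
    loaded_without_download = ensure_then_load(
        [cached, Path(tmp) / "other.tif"],
        fake_download,
        fake_load,
        ensure_available=False,
    )
```

With `ensure_available=False` the sketch loads only what is already present, which is why the second call returns a single item.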
GHSLDataReader ¶
Bases: BaseHandlerReader
Source code in gigaspatial/handlers/ghsl.py
__init__(config, data_store=None, logger=None) ¶
Initialize the reader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Union[GHSLDataConfig, dict[str, Union[str, int]]] | Configuration for the GHSL dataset, either as a GHSLDataConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/ghsl.py
load_from_paths(source_data_path, merge_rasters=False, **kwargs) ¶
Load TifProcessors from GHSL dataset.
Parameters:
| Name | Description | Default |
|---|---|---|
source_data_path | List of file paths to load | required |
merge_rasters | If True, all rasters will be merged into a single TifProcessor. | False |
Returns:
| Type | Description |
|---|---|
Union[List[TifProcessor], TifProcessor] | List of TifProcessor objects for accessing the raster data, or a single TifProcessor if merge_rasters is True. |
Source code in gigaspatial/handlers/ghsl.py
giga ¶
GigaSchoolLocationFetcher ¶
Fetch and process school location data from the Giga School Geolocation Data API.
Source code in gigaspatial/handlers/giga.py
fetch_locations(process_geospatial=False, **kwargs) ¶
Fetch and process school locations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
process_geospatial | bool | Whether to process geospatial data and return a GeoDataFrame. Defaults to False. | False |
**kwargs | Additional parameters for customization: page_size (override default page size), sleep_time (override default sleep time between requests), max_pages (limit the number of pages to fetch) | {} |
Returns:
| Type | Description |
|---|---|
Union[DataFrame, GeoDataFrame] | School locations with geospatial info. A GeoDataFrame is returned when process_geospatial is True; otherwise a pd.DataFrame. |
Source code in gigaspatial/handlers/giga.py
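The `page_size`, `sleep_time`, and `max_pages` overrides suggest a paginated request loop. The sketch below shows one common way such a loop can be written. It assumes a hypothetical `fetch_page(page, page_size)` callable that returns a list of records, and it assumes that a page shorter than `page_size` marks the end of the data; the actual fetcher may detect the last page differently.

```python
import time
from typing import Callable, Dict, List, Optional


def fetch_all_pages(
    fetch_page: Callable[[int, int], List[Dict]],  # hypothetical: (page, page_size) -> records
    page_size: int = 1000,
    sleep_time: float = 0.0,
    max_pages: Optional[int] = None,
) -> List[Dict]:
    """Collect records page by page until a short page or max_pages is reached."""
    records: List[Dict] = []
    page = 1
    while max_pages is None or page <= max_pages:
        batch = fetch_page(page, page_size)
        records.extend(batch)
        # Assumption: a page shorter than page_size is the last one.
        if len(batch) < page_size:
            break
        page += 1
        time.sleep(sleep_time)  # pause between requests
    return records


# Stand-in for an API: 25 fake schools served in pages.
_FAKE_SCHOOLS = [{"school_id": i} for i in range(25)]


def fake_fetch_page(page: int, page_size: int) -> List[Dict]:
    start = (page - 1) * page_size
    return _FAKE_SCHOOLS[start:start + page_size]


all_rows = fetch_all_pages(fake_fetch_page, page_size=10)
limited_rows = fetch_all_pages(fake_fetch_page, page_size=10, max_pages=2)
```

Setting `max_pages` caps the number of requests, which is useful for quick tests against a large dataset.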
GigaSchoolMeasurementsFetcher ¶
Fetch and process school daily realtime connectivity measurements from the Giga API. This includes download/upload speeds, latency, and connectivity performance data.
Source code in gigaspatial/handlers/giga.py
fetch_measurements(**kwargs) ¶
Fetch and process school connectivity measurements.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | Additional parameters for customization: page_size (override default page size), sleep_time (override default sleep time between requests), max_pages (limit the number of pages to fetch), giga_id_school (override default giga_id_school filter), start_date (override default start_date), end_date (override default end_date) | {} |
Returns:
| Type | Description |
|---|---|
DataFrame | pd.DataFrame: School measurements with connectivity performance data. |
Source code in gigaspatial/handlers/giga.py
get_performance_summary(df) ¶
Generate a comprehensive summary of connectivity performance metrics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df | DataFrame | DataFrame with measurement data | required |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | Summary statistics about connectivity performance |
Source code in gigaspatial/handlers/giga.py
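To make the idea of a "performance summary" dict concrete, here is a small sketch using only the standard library. The metric names (`download_speed`, `upload_speed`, `latency`) and the shape of the returned dict are assumptions for illustration; the real method operates on a DataFrame and its output keys are not documented here.

```python
from statistics import mean, median
from typing import Dict, List, Sequence


def performance_summary(
    rows: List[dict],
    metrics: Sequence[str] = ("download_speed", "upload_speed", "latency"),  # assumed names
) -> Dict[str, object]:
    """Summarise each metric, ignoring missing (None) values."""
    summary: Dict[str, object] = {"total_measurements": len(rows)}
    for metric in metrics:
        values = [row[metric] for row in rows if row.get(metric) is not None]
        if values:
            summary[metric] = {
                "mean": mean(values),
                "median": median(values),
                "min": min(values),
                "max": max(values),
            }
    return summary


measurements = [
    {"download_speed": 10.0, "upload_speed": 2.0, "latency": 40.0},
    {"download_speed": 30.0, "upload_speed": None, "latency": 60.0},
]
summary = performance_summary(measurements)
```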
get_school_performance_comparison(df, top_n=10) ¶
Compare performance across schools.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df | DataFrame | DataFrame with measurement data | required |
top_n | int | Number of top/bottom schools to include | 10 |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | School performance comparison |
Source code in gigaspatial/handlers/giga.py
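The `top_n` parameter selects how many of the best and worst schools to report. A minimal sketch of that ranking step follows, assuming hypothetical `school_id` and `download_speed` fields and ranking by mean download speed; the library may rank on different or additional metrics.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def compare_schools(rows: List[dict], top_n: int = 10) -> Dict[str, list]:
    """Rank schools by mean download speed; return the top and bottom top_n."""
    if top_n <= 0:
        return {"top": [], "bottom": []}
    speeds = defaultdict(list)
    for row in rows:
        speeds[row["school_id"]].append(row["download_speed"])
    ranked = sorted(
        ((school, mean(values)) for school, values in speeds.items()),
        key=lambda item: item[1],
        reverse=True,  # fastest first
    )
    # "bottom" is listed slowest first.
    return {"top": ranked[:top_n], "bottom": ranked[-top_n:][::-1]}


rows = [
    {"school_id": "a", "download_speed": 10.0},
    {"school_id": "a", "download_speed": 20.0},
    {"school_id": "b", "download_speed": 5.0},
    {"school_id": "c", "download_speed": 40.0},
]
comparison = compare_schools(rows, top_n=1)
```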
GigaSchoolProfileFetcher ¶
Fetch and process school profile data from the Giga School Profile API. This includes connectivity information and other school details.
Source code in gigaspatial/handlers/giga.py
fetch_profiles(**kwargs) ¶
Fetch and process school profiles including connectivity information.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | Additional parameters for customization: page_size (override default page size), sleep_time (override default sleep time between requests), max_pages (limit the number of pages to fetch), giga_id_school (override default giga_id_school filter) | {} |
Returns:
| Type | Description |
|---|---|
DataFrame | pd.DataFrame: School profiles with connectivity and geospatial info. |
Source code in gigaspatial/handlers/giga.py
get_connectivity_summary(df) ¶
Generate a summary of connectivity statistics from the fetched data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df | DataFrame | DataFrame with school profile data | required |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | Summary statistics about connectivity |
Source code in gigaspatial/handlers/giga.py
google_open_buildings ¶
GoogleOpenBuildingsConfig dataclass ¶
Bases: BaseHandlerConfig
Configuration for Google Open Buildings dataset files. Implements the BaseHandlerConfig interface for data unit resolution.
Source code in gigaspatial/handlers/google_open_buildings.py
get_data_unit_path(unit, data_type='polygons', **kwargs) ¶
Given a tile row or tile_id, return the corresponding file path.
Source code in gigaspatial/handlers/google_open_buildings.py
get_data_unit_paths(units, data_type='polygons', **kwargs) ¶
Given data unit identifiers, return the corresponding file paths.
Source code in gigaspatial/handlers/google_open_buildings.py
get_relevant_data_units_by_geometry(geometry, **kwargs) ¶
Return intersecting tiles for a given geometry or GeoDataFrame.
Source code in gigaspatial/handlers/google_open_buildings.py
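Selecting "intersecting tiles" for a geometry can be pictured as an overlap test between the query extent and each tile's extent. The sketch below uses plain bounding boxes and a hypothetical `tiles` mapping; it is an illustration only, since the library presumably works with real tile geometries and a GeoDataFrame of tile metadata.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bboxes_intersect(a: BBox, b: BBox) -> bool:
    """True when two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def relevant_tiles(tiles: Dict[str, BBox], query: BBox) -> List[str]:
    """Return the ids of all tiles whose bounds intersect the query box."""
    return [tile_id for tile_id, bounds in tiles.items() if bboxes_intersect(bounds, query)]


# Hypothetical tile index: tile id -> bounds.
tiles = {
    "t1": (0.0, 0.0, 10.0, 10.0),
    "t2": (10.0, 0.0, 20.0, 10.0),
    "t3": (30.0, 30.0, 40.0, 40.0),
}
hits = relevant_tiles(tiles, (5.0, 5.0, 12.0, 8.0))
```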
GoogleOpenBuildingsDownloader ¶
Bases: BaseHandlerDownloader
A class to handle downloads of Google's Open Buildings dataset.
Source code in gigaspatial/handlers/google_open_buildings.py
__init__(config=None, data_store=None, logger=None) ¶
Initialize the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Optional[GoogleOpenBuildingsConfig] | Optional configuration for file paths and download settings. If None, a default | None |
data_store | Optional[DataStore] | Optional instance of a | None |
logger | Optional[Logger] | Optional custom logger instance. If None, a default logger named after the module is created and used. | None |
Source code in gigaspatial/handlers/google_open_buildings.py
download_by_country(country, data_type='polygons', data_store=None, country_geom_path=None) ¶
Download Google Open Buildings data for a specific country.
This is a convenience method to download data for an entire country using its code or name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
country | str | The country code (e.g., 'USA', 'GBR') or name. | required |
data_type | Literal['polygons', 'points'] | The type of building data to download ('polygons' or 'points'). Defaults to 'polygons'. | 'polygons' |
data_store | Optional[DataStore] | Optional instance of a | None |
country_geom_path | Optional[Union[str, Path]] | Optional path to a GeoJSON file containing the country boundary. If provided, this boundary is used instead of the default from | None |
Returns:
| Type | Description |
|---|---|
List[str] | A list of local file paths for the successfully downloaded tiles for the specified country. |
Source code in gigaspatial/handlers/google_open_buildings.py
download_data_unit(tile_info, data_type='polygons') ¶
Download data file for a single tile.
data_type: The type of building data to download ('polygons' or 'points'). Defaults to 'polygons'.
Source code in gigaspatial/handlers/google_open_buildings.py
download_data_units(tiles, data_type='polygons') ¶
Download data files for multiple tiles.
data_type: The type of building data to download ('polygons' or 'points'). Defaults to 'polygons'.
Source code in gigaspatial/handlers/google_open_buildings.py
GoogleOpenBuildingsHandler ¶
Bases: BaseHandler
Handler for Google Open Buildings dataset.
This class provides a unified interface for downloading and loading Google Open Buildings data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/google_open_buildings.py
create_config(data_store, logger, **kwargs) ¶
Create and return a GoogleOpenBuildingsConfig instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | Additional configuration parameters | {} |
Returns:
| Type | Description |
|---|---|
GoogleOpenBuildingsConfig | Configured GoogleOpenBuildingsConfig instance |
Source code in gigaspatial/handlers/google_open_buildings.py
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a GoogleOpenBuildingsDownloader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | GoogleOpenBuildingsConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | Additional downloader parameters | {} |
Returns:
| Type | Description |
|---|---|
GoogleOpenBuildingsDownloader | Configured GoogleOpenBuildingsDownloader instance |
Source code in gigaspatial/handlers/google_open_buildings.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a GoogleOpenBuildingsReader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | GoogleOpenBuildingsConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | Additional reader parameters | {} |
Returns:
| Type | Description |
|---|---|
GoogleOpenBuildingsReader | Configured GoogleOpenBuildingsReader instance |
Source code in gigaspatial/handlers/google_open_buildings.py
load_points(source, crop_to_source=False, ensure_available=True, **kwargs) ¶
Load point data from Google Open Buildings dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
GeoDataFrame | GeoDataFrame containing building point data |
Source code in gigaspatial/handlers/google_open_buildings.py
load_polygons(source, crop_to_source=False, ensure_available=True, **kwargs) ¶
Load polygon data from Google Open Buildings dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | Union[str, List[Union[tuple, Point]], BaseGeometry, GeoDataFrame, Path, List[Union[str, Path]]] | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
GeoDataFrame | GeoDataFrame containing building polygon data |
Source code in gigaspatial/handlers/google_open_buildings.py
GoogleOpenBuildingsReader ¶
Bases: BaseHandlerReader
Reader for Google Open Buildings data, supporting country, points, and geometry-based resolution.
Source code in gigaspatial/handlers/google_open_buildings.py
load_from_paths(source_data_path, **kwargs) ¶
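Load building data from Google Open Buildings dataset.
Parameters:
| Name | Description | Default |
|---|---|---|
source_data_path | List of file paths to load | required |
Returns:
| Type | Description |
|---|---|
GeoDataFrame | GeoDataFrame containing building data |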
Source code in gigaspatial/handlers/google_open_buildings.py
load_points(source, crop_to_source=False, **kwargs) ¶
Convenience method to load point data.
Source code in gigaspatial/handlers/google_open_buildings.py
load_polygons(source, crop_to_source=False, **kwargs) ¶
Convenience method to load polygon data.
Source code in gigaspatial/handlers/google_open_buildings.py
hdx ¶
HDXConfig dataclass ¶
Bases: BaseHandlerConfig
Configuration for HDX data access
Source code in gigaspatial/handlers/hdx.py
output_dir_path: Path property ¶
Path to save the downloaded HDX dataset
configure_hdx() ¶
Configure HDX API if not already configured
Source code in gigaspatial/handlers/hdx.py
extract_search_geometry(source, **kwargs) ¶
Override the base class method, since geometry extraction does not apply to HDX. Returns a filter dictionary instead.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | | Either a country name/code (str) or a filter dictionary | required |
**kwargs | Additional keyword arguments passed to the specific method | {} |
Source code in gigaspatial/handlers/hdx.py
fetch_dataset() ¶
Get the HDX dataset
Source code in gigaspatial/handlers/hdx.py
get_data_unit_path(unit, **kwargs) ¶
Get the path for a data unit
get_dataset_resources(filter=None, exact_match=False) ¶
Get resources from the HDX dataset
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
filter | Optional[Dict[str, Any]] | Dictionary of key-value pairs to filter resources | None |
exact_match | bool | If True, perform exact matching. If False, use pattern matching | False |
Source code in gigaspatial/handlers/hdx.py
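The difference between `exact_match=True` and pattern matching can be sketched as follows. This is an illustration under stated assumptions: resources are modelled as plain dicts, the parameter is named `filters` here to avoid shadowing the built-in, and "pattern matching" is assumed to mean a case-insensitive regular-expression search. The library's own matching rules may differ.

```python
import re
from typing import Any, Dict, List, Optional


def filter_resources(
    resources: List[Dict[str, Any]],
    filters: Optional[Dict[str, Any]] = None,
    exact_match: bool = False,
) -> List[Dict[str, Any]]:
    """Keep resources whose fields match every key-value pair in filters."""
    if not filters:
        return list(resources)

    def matches(resource: Dict[str, Any]) -> bool:
        for key, expected in filters.items():
            actual = str(resource.get(key, ""))
            if exact_match:
                if actual != str(expected):
                    return False
            # Assumed pattern semantics: case-insensitive regex search.
            elif re.search(str(expected), actual, re.IGNORECASE) is None:
                return False
        return True

    return [resource for resource in resources if matches(resource)]


# Hypothetical resource metadata.
resources = [
    {"name": "pop_2020.csv", "format": "CSV"},
    {"name": "pop_2020.tif", "format": "GeoTIFF"},
    {"name": "boundaries.geojson", "format": "GeoJSON"},
]
by_pattern = filter_resources(resources, {"format": "geo"})
by_exact = filter_resources(resources, {"format": "CSV"}, exact_match=True)
no_exact_hit = filter_resources(resources, {"format": "geo"}, exact_match=True)
```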
list_resources() ¶
List all resources in the dataset directory using the data_store.
Source code in gigaspatial/handlers/hdx.py
search_datasets(query, rows=None, sort='relevance asc, metadata_modified desc', hdx_site='prod', user_agent='gigaspatial') staticmethod ¶
Search for datasets in HDX before initializing the class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
query | str | Search query string | required |
rows | int | Number of results per page. Defaults to all datasets (sys.maxsize). | None |
sort | str | Sort order - one of 'relevance', 'views_recent', 'views_total', 'last_modified' (default: 'relevance asc, metadata_modified desc') | 'relevance asc, metadata_modified desc' |
hdx_site | str | HDX site to use - 'prod' or 'test' (default: 'prod') | 'prod' |
user_agent | str | User agent for HDX API requests (default: 'gigaspatial') | 'gigaspatial' |
Returns:
| Type | Description |
|---|---|
List[Dict] | List of dataset dictionaries containing search results |
Example
    results = HDXConfig.search_datasets("population", rows=5)
    for dataset in results:
        print(f"Name: {dataset['name']}, Title: {dataset['title']}")
Source code in gigaspatial/handlers/hdx.py
HDXDownloader ¶
Bases: BaseHandlerDownloader
Downloader for HDX datasets
Source code in gigaspatial/handlers/hdx.py
download_data_unit(resource, **kwargs) ¶
Download a single resource
Source code in gigaspatial/handlers/hdx.py
download_data_units(resources, **kwargs) ¶
Download multiple resources sequentially
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
resources | List[Resource] | List of HDX Resource objects | required |
**kwargs | Additional keyword arguments | {} |
Returns:
| Type | Description |
|---|---|
List[str] | List of paths to downloaded files |
Source code in gigaspatial/handlers/hdx.py
HDXHandler ¶
Bases: BaseHandler
Handler for HDX datasets
Source code in gigaspatial/handlers/hdx.py
create_config(data_store, logger, **kwargs) ¶
Create and return an HDXConfig instance
Source code in gigaspatial/handlers/hdx.py
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return an HDXDownloader instance
Source code in gigaspatial/handlers/hdx.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return an HDXReader instance
Source code in gigaspatial/handlers/hdx.py
HDXReader ¶
Bases: BaseHandlerReader
Reader for HDX datasets
Source code in gigaspatial/handlers/hdx.py
load_from_paths(source_data_path, **kwargs) ¶
Load data from paths
Source code in gigaspatial/handlers/hdx.py
healthsites ¶
HealthSitesFetcher ¶
Fetch and process health facility location data from the Healthsites.io API.
Source code in gigaspatial/handlers/healthsites.py
fetch_facilities(**kwargs) ¶
Fetch and process health facility locations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | Additional parameters for customization: country (override country filter), extent (override extent filter), from_date (get data modified from this timestamp; datetime or string), to_date (get data modified to this timestamp; datetime or string), page_size (override default page size), sleep_time (override default sleep time between requests), max_pages (limit the number of pages to fetch), output_format (override output format, 'json' or 'geojson'), flat_properties (override flat properties setting) | {} |
Returns:
| Type | Description |
|---|---|
Union[DataFrame, GeoDataFrame] | Health facilities data. Returns a GeoDataFrame for the geojson format and a DataFrame for the json format. |
Source code in gigaspatial/handlers/healthsites.py
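The split between a GeoDataFrame (geojson) and a DataFrame (json) reflects the general difference between geometry-bearing features and flat records. As an illustration only, the sketch below flattens a GeoJSON FeatureCollection into plain rows. The function name, the `longitude`/`latitude` keys, and the sample feature are hypothetical, and this is not how the fetcher itself builds its output.

```python
from typing import Any, Dict, List


def features_to_rows(feature_collection: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten GeoJSON features into dicts, copying Point coordinates into columns."""
    rows: List[Dict[str, Any]] = []
    for feature in feature_collection.get("features", []):
        row = dict(feature.get("properties") or {})
        geometry = feature.get("geometry") or {}
        if geometry.get("type") == "Point":
            # GeoJSON positions are ordered (longitude, latitude).
            lon, lat = geometry["coordinates"][:2]
            row["longitude"], row["latitude"] = lon, lat
        rows.append(row)
    return rows


# Hypothetical response fragment.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "Clinic A", "amenity": "clinic"},
            "geometry": {"type": "Point", "coordinates": [36.8, -1.3]},
        }
    ],
}
facility_rows = features_to_rows(sample)
```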
fetch_facility_by_id(osm_type, osm_id) ¶
Fetch a specific facility by OSM type and ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
osm_type | str | OSM type (node, way, relation) | required |
osm_id | str | OSM ID | required |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | Facility details |
Source code in gigaspatial/handlers/healthsites.py
fetch_statistics(**kwargs) ¶
Fetch statistics for health facilities.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | Same filtering parameters as fetch_facilities | {} |
Returns:
| Name | Type | Description |
|---|---|---|
dict | dict | Statistics data |
Source code in gigaspatial/handlers/healthsites.py
mapbox_image ¶
MapboxImageDownloader ¶
Class to download images from Mapbox Static Images API using a specific style
Source code in gigaspatial/handlers/mapbox_image.py
__init__(access_token=config.MAPBOX_ACCESS_TOKEN, style_id=None, data_store=None) ¶
Initialize the downloader with Mapbox credentials
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
access_token | str | Mapbox access token | MAPBOX_ACCESS_TOKEN |
style_id | Optional[str] | Mapbox style ID to use for image download | None |
data_store | Optional[DataStore] | Instance of DataStore for accessing data storage | None |
Source code in gigaspatial/handlers/mapbox_image.py
download_images_by_bounds(gdf, output_dir, image_size=(512, 512), max_workers=4, image_prefix='image_') ¶
Download images for the given bounding boxes using the specified style
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gdf | GeoDataFrame | GeoDataFrame containing bounding box polygons | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
max_workers | int | Maximum number of concurrent downloads | 4 |
image_prefix | str | Prefix for output image names | 'image_' |
Source code in gigaspatial/handlers/mapbox_image.py
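The `max_workers` parameter bounds how many downloads run at once. A common way to get that behaviour in Python is a thread pool; the sketch below shows the pattern with a hypothetical `fake_download` function standing in for the HTTP request. It is not taken from the downloader's source, which may handle errors and progress differently.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def download_all(
    jobs: Iterable[str],
    download_one: Callable[[str], str],  # hypothetical per-image download function
    max_workers: int = 4,
) -> Dict[str, object]:
    """Run downloads concurrently; map each job to its result or its exception."""
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download_one, job): job for job in jobs}
        for future in as_completed(futures):
            job = futures[future]
            try:
                results[job] = future.result()
            except Exception as exc:  # keep going if one image fails
                results[job] = exc
    return results


def fake_download(name: str) -> str:
    if name == "bad":
        raise ValueError("simulated download failure")
    return f"{name}.png"


results = download_all(["image_0", "image_1", "bad"], fake_download, max_workers=2)
```

Collecting exceptions per job, rather than raising on the first failure, lets the remaining images finish downloading.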
download_images_by_coordinates(data, res_meters_pixel, output_dir, image_size=(512, 512), max_workers=4, image_prefix='image_') ¶
Download images for given coordinates by creating bounding boxes around points
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data | Union[DataFrame, List[Tuple[float, float]]] | A DataFrame with either latitude/longitude columns or a geometry column, or a list of (lat, lon) tuples | required |
res_meters_pixel | float | Size of the bounding box in meters (creates a square) | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
max_workers | int | Maximum number of concurrent downloads | 4 |
image_prefix | str | Prefix for output image names | 'image_' |
Source code in gigaspatial/handlers/mapbox_image.py
download_images_by_tiles(mercator_tiles, output_dir, image_size=(512, 512), max_workers=4, image_prefix='image_') ¶
Download images for given mercator tiles using the specified style
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
mercator_tiles | MercatorTiles | MercatorTiles instance containing quadkeys | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
max_workers | int | Maximum number of concurrent downloads | 4 |
image_prefix | str | Prefix for output image names | 'image_' |
Source code in gigaspatial/handlers/mapbox_image.py
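Quadkeys identify Web Mercator tiles by interleaving the bits of the tile's x and y indices, one base-4 digit per zoom level. The sketch below implements the standard Bing Maps quadkey convention in both directions. How `MercatorTiles` stores or converts its quadkeys is not shown in this reference, so treat these helper functions as hypothetical illustrations.

```python
from typing import Tuple


def quadkey_to_tile(quadkey: str) -> Tuple[int, int, int]:
    """Convert a quadkey to (x, y, zoom) tile indices."""
    x = y = 0
    zoom = len(quadkey)
    for i, digit in enumerate(quadkey):
        mask = 1 << (zoom - i - 1)
        if digit == "1":
            x |= mask
        elif digit == "2":
            y |= mask
        elif digit == "3":
            x |= mask
            y |= mask
        elif digit != "0":
            raise ValueError(f"Invalid quadkey digit: {digit!r}")
    return x, y, zoom


def tile_to_quadkey(x: int, y: int, zoom: int) -> str:
    """Convert (x, y, zoom) tile indices to a quadkey."""
    digits = []
    for i in range(zoom, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


tile = quadkey_to_tile("213")
quadkey = tile_to_quadkey(3, 5, 3)
```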
maxar_image ¶
MaxarConfig ¶
Bases: BaseModel
Configuration for Maxar Image Downloader using Pydantic
Source code in gigaspatial/handlers/maxar_image.py
wms_url: str property ¶
Generate the full WMS URL with connection string
validate_non_empty(value, field) classmethod ¶
Ensure required credentials are provided
Source code in gigaspatial/handlers/maxar_image.py
MaxarImageDownloader ¶
Class to download images from Maxar
Source code in gigaspatial/handlers/maxar_image.py
__init__(config=None, data_store=None) ¶
Initialize the downloader with Maxar config.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Optional[MaxarConfig] | MaxarConfig instance containing credentials and settings | None |
data_store | Optional[DataStore] | Instance of DataStore for accessing data storage | None |
Source code in gigaspatial/handlers/maxar_image.py
download_images_by_bounds(gdf, output_dir, image_size=(512, 512), image_prefix='maxar_image_') ¶
Download images for the given bounding boxes using the specified style
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gdf | GeoDataFrame | GeoDataFrame containing bounding box polygons | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
image_prefix | str | Prefix for output image names | 'maxar_image_' |
Source code in gigaspatial/handlers/maxar_image.py
download_images_by_coordinates(data, res_meters_pixel, output_dir, image_size=(512, 512), image_prefix='maxar_image_') ¶
Download images for given coordinates by creating bounding boxes around points
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data | Union[DataFrame, List[Tuple[float, float]]] | Either a DataFrame with latitude/longitude columns or a geometry column, or a list of (lat, lon) tuples | required |
res_meters_pixel | float | Resolution in meters per pixel | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
image_prefix | str | Prefix for output image names | 'maxar_image_' |
Source code in gigaspatial/handlers/maxar_image.py
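The bounding box around each point follows from `res_meters_pixel` and `image_size`: the ground footprint is the resolution times the pixel count in each direction. The sketch below shows that arithmetic with a simple spherical approximation. `bbox_around_point` and the metres-per-degree constant are assumptions for illustration; the library may well buffer points in a projected CRS instead.

```python
import math

METERS_PER_DEGREE_LAT = 111_320.0  # rough spherical approximation (assumption)


def bbox_around_point(lat, lon, res_meters_pixel, image_size=(512, 512)):
    """Return (min_lon, min_lat, max_lon, max_lat) centred on the point."""
    width_px, height_px = image_size
    half_width_m = res_meters_pixel * width_px / 2
    half_height_m = res_meters_pixel * height_px / 2
    dlat = half_height_m / METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = half_width_m / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lon - dlon, lat - dlat, lon + dlon, lat + dlat
```

For example, 0.6 m per pixel at 512 x 512 pixels gives a footprint of roughly 307 m on each side.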
download_images_by_tiles(mercator_tiles, output_dir, image_size=(512, 512), image_prefix='maxar_image_') ¶
Download images for given mercator tiles using the specified style
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
mercator_tiles | MercatorTiles | MercatorTiles instance containing quadkeys | required |
output_dir | Union[str, Path] | Directory to save images | required |
image_size | Tuple[int, int] | Tuple of (width, height) for output images | (512, 512) |
image_prefix | str | Prefix for output image names | 'maxar_image_' |
Source code in gigaspatial/handlers/maxar_image.py
microsoft_global_buildings ¶
MSBuildingsConfig dataclass ¶
Bases: BaseHandlerConfig
Configuration for Microsoft Global Buildings dataset files.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
__post_init__() ¶
Initialize the configuration, load tile URLs, and set up location mapping.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_location_mapping(similarity_score_threshold=0.8) ¶
Create a mapping between the dataset's location names and ISO 3166-1 alpha-3 country codes.
This function iterates through known countries and attempts to find matching locations in the dataset based on string similarity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
similarity_score_threshold | float | The minimum similarity score (between 0 and 1) for a dataset location to be considered a match for a country. Defaults to 0.8. | 0.8 |
Returns:
| Type | Description |
|---|---|
| | A dictionary where keys are dataset location names and values are the corresponding ISO 3166-1 alpha-3 country codes. |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
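The effect of `similarity_score_threshold` can be illustrated with a small string-matching sketch. This example uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure; the measure gigaspatial actually uses is not stated here, and `build_location_mapping` is a hypothetical name.

```python
import re
from difflib import SequenceMatcher


def _normalize(name):
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def build_location_mapping(dataset_locations, countries, similarity_score_threshold=0.8):
    """Map dataset location names to ISO alpha-3 codes by name similarity.

    `countries` maps ISO 3166-1 alpha-3 codes to country names.
    """
    mapping = {}
    for iso3, country_name in countries.items():
        target = _normalize(country_name)
        best_location, best_score = None, 0.0
        for location in dataset_locations:
            score = SequenceMatcher(None, _normalize(location), target).ratio()
            if score > best_score:
                best_location, best_score = location, score
        # Keep only the best candidate, and only if it clears the threshold.
        if best_location is not None and best_score >= similarity_score_threshold:
            mapping[best_location] = iso3
    return mapping
```

Raising the threshold reduces false matches at the cost of leaving more dataset locations unmapped.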
extract_search_geometry(source, **kwargs) ¶
get_relevant_data_units_by_geometry(geometry, **kwargs) ¶
Get the DataFrame of Microsoft Buildings tiles that intersect with a given source spatial geometry.
If a country is given, this method first tries to find tiles directly mapped to that country. If no directly mapped tiles are found and the country is not in the location mapping, it attempts to find overlapping tiles by creating Mercator tiles for the country and filtering the dataset's tiles.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
MSBuildingsDownloader ¶
Bases: BaseHandlerDownloader
A class to handle downloads of Microsoft's Global ML Building Footprints dataset.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
__init__(config=None, data_store=None, logger=None) ¶
Initialize the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Optional[MSBuildingsConfig] | Optional configuration for customizing download behavior and file paths. If None, a default | None |
data_store | Optional[DataStore] | Optional instance of a | None |
logger | Optional[Logger] | Optional custom logger instance. If None, a default logger named after the module is created and used. | None |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
download_by_country(country, data_store=None, country_geom_path=None) ¶
Download Microsoft Global ML Building Footprints data for a specific country.
This is a convenience method to download data for an entire country using its code or name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
country | str | The country code (e.g., 'USA', 'GBR') or name. | required |
data_store | Optional[DataStore] | Optional instance of a | None |
country_geom_path | Optional[Union[str, Path]] | Optional path to a GeoJSON file containing the country boundary. If provided, this boundary is used instead of the default from | None |
Returns:
| Type | Description |
|---|---|
List[str] | A list of local file paths for the successfully downloaded tiles. Returns an empty list if no data is found for the country or if all downloads fail. |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
download_data_unit(tile_info, **kwargs) ¶
Download data file for a single tile.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
download_data_units(tiles, **kwargs) ¶
Download data files for multiple tiles.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
MSBuildingsHandler ¶
Bases: BaseHandler
Handler for Microsoft Global Buildings dataset.
This class provides a unified interface for downloading and loading Microsoft Global Buildings data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_config(data_store, logger, **kwargs) ¶
Create and return a MSBuildingsConfig instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional configuration parameters | {} |
Returns:
| Type | Description |
|---|---|
MSBuildingsConfig | Configured MSBuildingsConfig instance |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a MSBuildingsDownloader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | MSBuildingsConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional downloader parameters | {} |
Returns:
| Type | Description |
|---|---|
MSBuildingsDownloader | Configured MSBuildingsDownloader instance |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a MSBuildingsReader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | MSBuildingsConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional reader parameters | {} |
Returns:
| Type | Description |
|---|---|
MSBuildingsReader | Configured MSBuildingsReader instance |
Source code in gigaspatial/handlers/microsoft_global_buildings.py
MSBuildingsReader ¶
Bases: BaseHandlerReader
Reader for Microsoft Global Buildings data, supporting country, points, and geometry-based resolution.
Source code in gigaspatial/handlers/microsoft_global_buildings.py
load_from_paths(source_data_path, **kwargs) ¶
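Load building data from the Microsoft Buildings dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source_data_path | | List of file paths to load | required |
Returns:
| Type | Description |
|---|---|
| | GeoDataFrame containing building data |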
Source code in gigaspatial/handlers/microsoft_global_buildings.py
ookla_speedtest ¶
OoklaSpeedtestConfig dataclass ¶
Bases: BaseHandlerConfig
Configuration class for Ookla Speedtest data.
This class defines the parameters for accessing and filtering Ookla Speedtest datasets, including available years, quarters, and how dataset URLs are constructed.
Source code in gigaspatial/handlers/ookla_speedtest.py
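As an illustration of how a dataset URL can be built from a service type, year, and quarter, the sketch below follows the layout of Ookla's public open-data bucket as the author of this note understands it. The base URL, the file naming pattern, and the function name `ookla_parquet_url` are assumptions and may not match what `OoklaSpeedtestConfig` does internally.

```python
from datetime import date

# Assumed layout of the public Ookla open-data bucket.
BASE_URL = "https://ookla-open-data.s3.amazonaws.com/parquet/performance"


def ookla_parquet_url(service_type: str, year: int, quarter: int) -> str:
    """Build the parquet URL for one service type, year, and quarter."""
    if service_type not in ("fixed", "mobile"):
        raise ValueError("service_type must be 'fixed' or 'mobile'")
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be between 1 and 4")
    # Files are named after the first day of the quarter.
    quarter_start = date(year, 3 * (quarter - 1) + 1, 1)
    return (
        f"{BASE_URL}/type={service_type}/year={year}/quarter={quarter}/"
        f"{quarter_start:%Y-%m-%d}_performance_{service_type}_tiles.parquet"
    )
```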
get_data_unit_path(unit, **kwargs) ¶
Given an Ookla Speedtest file URL, return the corresponding path.
OoklaSpeedtestDownloader ¶
Bases: BaseHandlerDownloader
A class to handle the downloading of Ookla Speedtest data.
This downloader focuses on fetching parquet files based on the provided configuration and data unit URLs.
Source code in gigaspatial/handlers/ookla_speedtest.py
OoklaSpeedtestHandler ¶
Bases: BaseHandler
Handler for Ookla Speedtest data.
This class orchestrates the configuration, downloading, and reading of Ookla Speedtest data, allowing for filtering by geographical sources using Mercator tiles.
Source code in gigaspatial/handlers/ookla_speedtest.py
OoklaSpeedtestReader ¶
Bases: BaseHandlerReader
A class to handle reading Ookla Speedtest data.
It loads parquet files into a DataFrame.
Source code in gigaspatial/handlers/ookla_speedtest.py
opencellid ¶
OpenCellIDConfig ¶
Bases: BaseModel
Configuration for OpenCellID data access
Source code in gigaspatial/handlers/opencellid.py
output_file_path: Path property ¶
Path to save the downloaded OpenCellID data
OpenCellIDDownloader ¶
Downloader for OpenCellID data
Source code in gigaspatial/handlers/opencellid.py
download_and_process() ¶
Download and process OpenCellID data for the configured country
Source code in gigaspatial/handlers/opencellid.py
from_country(country, api_token=global_config.OPENCELLID_ACCESS_TOKEN, **kwargs) classmethod ¶
Create a downloader for a specific country
Source code in gigaspatial/handlers/opencellid.py
get_download_links() ¶
Get download links for the country from OpenCellID website
Source code in gigaspatial/handlers/opencellid.py
OpenCellIDReader ¶
Reader for OpenCellID data
Source code in gigaspatial/handlers/opencellid.py
read_data() ¶
Read OpenCellID data for the specified country
Source code in gigaspatial/handlers/opencellid.py
to_geodataframe() ¶
Convert OpenCellID data to a GeoDataFrame
osm ¶
OSMLocationFetcher ¶
A class to fetch and process location data from OpenStreetMap using the Overpass API.
This class supports fetching various OSM location types including amenities, buildings, shops, and other POI categories.
Source code in gigaspatial/handlers/osm.py
__post_init__() ¶
Validate inputs, normalize location_types, and set up logging.
Source code in gigaspatial/handlers/osm.py
fetch_locations(since_date=None, handle_duplicates='separate', include_metadata=False) ¶
Fetch OSM locations, optionally filtered by 'since' date.
Use this for incremental updates or getting all current locations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
since_date | | Filter for locations added or modified since this date. If None, all current locations are fetched. | None |
handle_duplicates | str | How to handle objects matching multiple categories: - 'separate': Create separate entries for each category (default) - 'combine': Use a single entry with a list of matching categories - 'primary': Keep only the first matching category | 'separate' |
include_metadata | bool | If True, include change tracking metadata (timestamp, version, changeset, user, uid) | False |
Returns:
| Type | Description |
|---|---|
DataFrame | pd.DataFrame: Processed OSM locations |
Source code in gigaspatial/handlers/osm.py
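The three `handle_duplicates` modes differ only in how an object that matches several categories is turned into output rows. The sketch below shows the idea on a plain dictionary. The record layout (`categories`, `category`) and the name `expand_categories` are hypothetical and chosen for illustration only.

```python
def expand_categories(element, handle_duplicates="separate"):
    """Turn one element matching several categories into output records."""
    categories = element["categories"]
    base = {k: v for k, v in element.items() if k != "categories"}
    if handle_duplicates == "separate":
        # One record per matching category.
        return [{**base, "category": c} for c in categories]
    if handle_duplicates == "combine":
        # A single record holding the list of all matching categories.
        return [{**base, "category": list(categories)}]
    if handle_duplicates == "primary":
        # A single record keeping only the first matching category.
        return [{**base, "category": categories[0]}]
    raise ValueError(f"Unknown handle_duplicates value: {handle_duplicates!r}")
```

With 'separate', row counts can exceed the number of OSM objects, which matters when the result is later aggregated.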
fetch_locations_changed_between(start_date, end_date, handle_duplicates='separate', include_metadata=True) ¶
Fetch OSM locations that changed within a specific date range.
Use this for historical analysis or tracking changes in a specific period.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
start_date | Union[str, datetime] | Start date/time in ISO 8601 format (str: "YYYY-MM-DDThh:mm:ssZ") or datetime object. Changes after this date will be included. | required |
end_date | Union[str, datetime] | End date/time in ISO 8601 format (str: "YYYY-MM-DDThh:mm:ssZ") or datetime object. Changes before this date will be included. | required |
handle_duplicates | Literal['separate', 'combine', 'primary'] | How to handle objects matching multiple categories: - 'separate': Create separate entries for each category (default) - 'combine': Use a single entry with a list of matching categories - 'primary': Keep only the first matching category | 'separate' |
include_metadata | bool | If True, include change tracking metadata (timestamp, version, changeset, user, uid) Defaults to True since change tracking is the main use case. | True |
Returns:
| Type | Description |
|---|---|
DataFrame | pd.DataFrame: Processed OSM locations that changed within the date range |
Raises:
| Type | Description |
|---|---|
ValueError | If dates are invalid or start_date is after end_date |
Source code in gigaspatial/handlers/osm.py
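The date handling described above (ISO 8601 strings or `datetime` objects, with a `ValueError` for invalid input or a reversed range) can be sketched as follows. The helper names `to_utc` and `validate_date_range` are hypothetical, and treating naive datetimes as UTC is an assumption of this sketch rather than documented behaviour.

```python
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_utc(value):
    """Convert a 'YYYY-MM-DDThh:mm:ssZ' string or a datetime to an aware UTC datetime."""
    if isinstance(value, datetime):
        # Naive datetimes are treated as UTC (assumption).
        aware = value if value.tzinfo else value.replace(tzinfo=timezone.utc)
        return aware.astimezone(timezone.utc)
    if isinstance(value, str):
        try:
            return datetime.strptime(value, ISO_FORMAT).replace(tzinfo=timezone.utc)
        except ValueError as exc:
            raise ValueError(f"Invalid ISO 8601 timestamp: {value!r}") from exc
    raise ValueError(f"Unsupported date type: {type(value).__name__}")


def validate_date_range(start_date, end_date):
    """Return both bounds as ISO 8601 strings, or raise ValueError."""
    start, end = to_utc(start_date), to_utc(end_date)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    return start.strftime(ISO_FORMAT), end.strftime(ISO_FORMAT)
```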
get_admin_names(admin_level, country=None, timeout=120) staticmethod ¶
Fetch all admin area names for a given admin_level (optionally within a country).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
admin_level | int | The OSM admin_level to search for (e.g., 4 for states, 6 for counties). | required |
country | str | Country name or ISO code to filter within. | None |
timeout | int | Timeout for the Overpass API request. | 120 |
Returns:
| Type | Description |
|---|---|
List[str] | List[str]: List of admin area names. |
Source code in gigaspatial/handlers/osm.py
get_osm_countries(iso3_code=None, include_names=True, timeout=1000) staticmethod ¶
Fetch countries from OpenStreetMap database.
This queries the actual OSM database for country boundaries and returns country names as they appear in OSM, including various name translations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
iso3_code | str | ISO 3166-1 alpha-3 code to fetch a specific country. If provided, returns single country data. If None, returns all countries. | None |
include_names | bool | If True, return dict with multiple name variants. If False, return only the primary name. | True |
timeout | int | Timeout for the Overpass API request (default: 1000). | 1000 |
Returns:
| Type | Description |
|---|---|
Union[str, Dict[str, str], List[str], List[Dict[str, str]]] | When iso3_code is provided: a single country name (str) if include_names=False, or a dict with name variants if include_names=True. When iso3_code is None: a list of country names if include_names=False, or a list of dicts with name variants (name, name:en, ISO3166-1 codes, and other name translations) if include_names=True. |
Raises:
| Type | Description |
|---|---|
ValueError | If iso3_code is provided but country not found in OSM. |
Source code in gigaspatial/handlers/osm.py
overture ¶
OvertureAmenityFetcher ¶
A class to fetch and process amenity locations from Overture.
Source code in gigaspatial/handlers/overture.py
__post_init__() ¶
Validate inputs and set up logging.
Source code in gigaspatial/handlers/overture.py
fetch_locations(match_pattern=False, **kwargs) ¶
Fetch and process amenity locations.
Source code in gigaspatial/handlers/overture.py
rwi ¶
RWIConfig dataclass ¶
Bases: HDXConfig
Configuration for Relative Wealth Index data access
Source code in gigaspatial/handlers/rwi.py
RWIDownloader ¶
Bases: HDXDownloader
Specialized downloader for the Relative Wealth Index dataset from HDX
Source code in gigaspatial/handlers/rwi.py
RWIHandler ¶
Bases: HDXHandler
Handler for Relative Wealth Index dataset
Source code in gigaspatial/handlers/rwi.py
create_config(data_store, logger, **kwargs) ¶
Create and return a RWIConfig instance
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a RWIDownloader instance
Source code in gigaspatial/handlers/rwi.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a RWIReader instance
Source code in gigaspatial/handlers/rwi.py
RWIReader ¶
Bases: HDXReader
Specialized reader for the Relative Wealth Index dataset from HDX
Source code in gigaspatial/handlers/rwi.py
srtm ¶
nasa_srtm ¶
NasaSRTMConfig dataclass ¶
Bases: BaseHandlerConfig
Configuration for NASA SRTM .hgt tiles (30m or 90m). Creates tile geometries dynamically for 1°x1° grid cells.
Each tile file covers 1 degree latitude x 1 degree longitude.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
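Since each tile covers a 1° x 1° cell, the tile containing a point can be derived from the floor of its latitude and longitude. The sketch below follows the usual SRTM naming convention, in which a tile is named after its south-west corner. `srtm_tile_id` is a hypothetical helper name, not necessarily the one used in gigaspatial.

```python
import math


def srtm_tile_id(latitude: float, longitude: float) -> str:
    """Name of the 1-degree tile containing the point, keyed by its south-west corner."""
    lat_floor = math.floor(latitude)
    lon_floor = math.floor(longitude)
    ns = "N" if lat_floor >= 0 else "S"
    ew = "E" if lon_floor >= 0 else "W"
    return f"{ns}{abs(lat_floor):02d}{ew}{abs(lon_floor):03d}"
```

Using `math.floor` rather than `int` matters for negative coordinates: a latitude of -2.5 belongs to the tile whose southern edge is at -3.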
get_data_unit_path(unit, **kwargs) ¶
Given a tile unit or tile_id, return expected storage path.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
get_data_unit_paths(units, **kwargs) ¶
Given tile identifiers, return list of file paths.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
NasaSRTMDownloader ¶
Bases: BaseHandlerDownloader
A class to handle downloads of NASA SRTM elevation data.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
__init__(config=None, data_store=None, logger=None) ¶
Initialize the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Optional[NasaSRTMConfig] | Optional configuration for customizing download behavior and file paths. If None, a default | None |
data_store | Optional[DataStore] | Optional instance of a | None |
logger | Optional[Logger] | Optional custom logger instance. If None, a default logger named after the module is created and used. | None |
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
download_data_unit(tile_info, **kwargs) ¶
Download data file for a single SRTM tile.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
download_data_units(tiles, **kwargs) ¶
Download data files for multiple SRTM tiles.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
NasaSRTMHandler ¶
Bases: BaseHandler
Main handler class for NASA SRTM elevation data.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
create_config(data_store, logger, **kwargs) ¶
Create and return a NasaSRTMConfig instance.
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a NasaSRTMDownloader instance.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a NasaSRTMReader instance.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
NasaSRTMReader ¶
Bases: BaseHandlerReader
A class to handle reading of NASA SRTM elevation data.
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
__init__(config=None, data_store=None, logger=None) ¶
Initialize the reader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Optional[NasaSRTMConfig] | Optional configuration for customizing reading behavior and file paths. If None, a default | None |
data_store | Optional[DataStore] | Optional instance of a | None |
logger | Optional[Logger] | Optional custom logger instance. If None, a default logger named after the module is created and used. | None |
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
load_from_paths(source_data_path, **kwargs) ¶
Load SRTM elevation data from file paths.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source_data_path | List[Union[str, Path]] | List of SRTM .hgt.zip file paths | required |
**kwargs | | Additional parameters for data loading: as_dataframe (bool, default=True): if True, return a concatenated DataFrame; if False, return a list of SRTMParser objects. dropna (bool, default=True): if True, drop rows with NaN elevation values. | {} |
Returns:
| Type | Description |
|---|---|
Union[DataFrame, List[SRTMParser]] | Union[pd.DataFrame, List[SRTMParser]]: Loaded elevation data |
Source code in gigaspatial/handlers/srtm/nasa_srtm.py
srtm_manager ¶
SRTMManager ¶
Manager for accessing elevation data across multiple SRTM .hgt.zip files.
Implements lazy loading with LRU caching for efficient memory usage. Automatically handles multiple tiles for elevation profiles.
Source code in gigaspatial/handlers/srtm/srtm_manager.py
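Lazy loading with an LRU cache means a tile is only parsed when first requested, and the least recently used tile is dropped once the cache is full. The following is a minimal sketch of that pattern using `collections.OrderedDict`. `LRUTileCache` and its `loader` callback are hypothetical and are not the manager's actual internals.

```python
from collections import OrderedDict


class LRUTileCache:
    """Load tiles on demand and keep at most `cache_size` of them in memory."""

    def __init__(self, loader, cache_size=10):
        self._loader = loader  # callable: tile_id -> parsed tile
        self._cache_size = cache_size
        self._tiles = OrderedDict()

    def get(self, tile_id):
        if tile_id in self._tiles:
            # Mark the tile as most recently used.
            self._tiles.move_to_end(tile_id)
            return self._tiles[tile_id]
        tile = self._loader(tile_id)  # lazy load on first access
        self._tiles[tile_id] = tile
        if len(self._tiles) > self._cache_size:
            # Evict the least recently used tile.
            self._tiles.popitem(last=False)
        return tile

    def cached_ids(self):
        """Tile ids from least to most recently used."""
        return list(self._tiles)
```

A larger `cache_size` trades memory for fewer reloads when queries jump between many tiles.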
__init__(srtm_directory, downloader=None, cache_size=10, data_store=None) ¶
Initialize the SRTM Manager.
Parameters¶
| Name | Type | Description | Default |
|---|---|---|---|
srtm_directory | str or Path | Directory containing .hgt.zip files | required |
downloader | optional | Downloader instance for auto-downloading missing tiles | None |
cache_size | int | Maximum number of SRTM tiles to keep in memory (LRU cache) | 10 |
data_store | DataStore, optional | Data store for reading files. Priority: provided data_store > downloader.data_store > LocalDataStore() | None |
Source code in gigaspatial/handlers/srtm/srtm_manager.py
check_coverage(latitude, longitude) ¶
Check if a specific coordinate has SRTM coverage.
Parameters¶
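| Name | Type | Description | Default |
|---|---|---|---|
latitude | float | Latitude in decimal degrees | required |
longitude | float | Longitude in decimal degrees | required |
Returns¶
| Type | Description |
|---|---|
bool | True if tile is available, False otherwise |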
Source code in gigaspatial/handlers/srtm/srtm_manager.py
clear_cache() ¶
get_available_tiles() ¶
get_cache_info() ¶
get_elevation(latitude, longitude) ¶
Get interpolated elevation for a specific coordinate.
Automatically finds and loads the correct SRTM tile.
Parameters¶
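| Name | Type | Description | Default |
|---|---|---|---|
latitude | float | Latitude in decimal degrees (-90 to 90) | required |
longitude | float | Longitude in decimal degrees (-180 to 180) | required |
Returns¶
| Type | Description |
|---|---|
float | Interpolated elevation in meters |
Raises¶
| Type | Description |
|---|---|
FileNotFoundError | If the required SRTM tile is not available |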
Source code in gigaspatial/handlers/srtm/srtm_manager.py
get_elevation_batch(coordinates) ¶
Get elevations for multiple coordinates efficiently.
Groups coordinates by tile to minimize parser loads.
Parameters¶
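| Name | Type | Description | Default |
|---|---|---|---|
coordinates | np.ndarray of shape (n, 2) | Array of (latitude, longitude) pairs | required |
Returns¶
| Type | Description |
|---|---|
np.ndarray of shape (n,) | Elevations in meters |
Raises¶
| Type | Description |
|---|---|
FileNotFoundError | If any required SRTM tile is not available |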
Source code in gigaspatial/handlers/srtm/srtm_manager.py
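Grouping coordinates by tile means each tile is loaded once per batch, however many points fall inside it. The sketch below shows the grouping step with the standard library only. `group_by_tile`, `elevations_for`, and the `lookup` callback are hypothetical names, and the per-tile lookup is left abstract.

```python
import math
from collections import defaultdict


def group_by_tile(coordinates):
    """Map (lat_floor, lon_floor) tile keys to the indices of the points inside them."""
    groups = defaultdict(list)
    for index, (lat, lon) in enumerate(coordinates):
        groups[(math.floor(lat), math.floor(lon))].append(index)
    return dict(groups)


def elevations_for(coordinates, lookup):
    """Resolve elevations tile by tile, preserving the input order.

    `lookup(tile_key, points)` is a hypothetical per-tile batch function
    returning one value per point.
    """
    result = [None] * len(coordinates)
    for tile_key, indices in group_by_tile(coordinates).items():
        values = lookup(tile_key, [coordinates[i] for i in indices])
        for i, value in zip(indices, values):
            result[i] = value
    return result
```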
get_elevation_profile(start_lat, start_lon, end_lat, end_lon, num_points=100) ¶
Get elevation profile between two points.
Uses linear interpolation between points and automatically handles multiple SRTM tiles. For more accurate great circle paths over long distances, consider using geopy.
Parameters¶
| Name | Type | Description | Default |
|---|---|---|---|
start_lat | float | Starting latitude in decimal degrees | required |
start_lon | float | Starting longitude in decimal degrees | required |
end_lat | float | Ending latitude in decimal degrees | required |
end_lon | float | Ending longitude in decimal degrees | required |
num_points | int | Number of sample points along the path | 100 |
Returns¶
| Type | Description |
|---|---|
pd.DataFrame | DataFrame with columns: distance_km, latitude, longitude, elevation |
Raises¶
| Type | Description |
|---|---|
FileNotFoundError | If any required SRTM tile along the path is not available |
Source code in gigaspatial/handlers/srtm/srtm_manager.py
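The sampling step of an elevation profile can be sketched as linear interpolation of latitude and longitude between the two endpoints, with distance spread evenly along the path. The example below assumes a haversine distance on a sphere of radius 6371 km for the total path length; how the library computes `distance_km` is not stated here, and `profile_points` is a hypothetical name.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption for this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def profile_points(start_lat, start_lon, end_lat, end_lon, num_points=100):
    """Return (distance_km, latitude, longitude) samples along a straight lat/lon line."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    total_km = haversine_km(start_lat, start_lon, end_lat, end_lon)
    rows = []
    for i in range(num_points):
        t = i / (num_points - 1)  # fraction of the way along the path
        rows.append((
            t * total_km,
            start_lat + t * (end_lat - start_lat),
            start_lon + t * (end_lon - start_lon),
        ))
    return rows
```

Each sampled point would then be passed to an elevation lookup to fill the elevation column.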
srtm_parser ¶
SRTMParser ¶
Efficient parser for NASA SRTM .hgt.zip files.
Supports both SRTM-1 (3601x3601, 1 arc-second) and SRTM-3 (1201x1201, 3 arc-second) formats. Uses memory mapping for efficient handling of large files.
Source code in gigaspatial/handlers/srtm/srtm_parser.py
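The two formats can be told apart from the size of the raw .hgt payload, because each sample is a big-endian signed 16-bit integer. The sketch below infers the grid size and decodes a row using only the standard library. It does not show memory mapping, and the helper names are hypothetical.

```python
import math
import struct

VOID_VALUE = -32768  # SRTM marker for missing data


def grid_size_from_bytes(num_bytes: int) -> int:
    """Return 3601 (SRTM-1) or 1201 (SRTM-3) for a raw .hgt payload size."""
    samples = num_bytes // 2  # two bytes per elevation sample
    size = math.isqrt(samples)
    if num_bytes % 2 or size * size != samples or size not in (1201, 3601):
        raise ValueError(f"Unexpected .hgt size: {num_bytes} bytes")
    return size


def decode_row(raw: bytes) -> list:
    """Decode big-endian int16 samples, mapping void values to None."""
    count = len(raw) // 2
    values = struct.unpack(f">{count}h", raw)
    return [None if v == VOID_VALUE else v for v in values]
```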
__init__(hgt_zip_path, data_store=None) ¶
Initialize the SRTM parser.
Parameters¶
hgt_zip_path : str or Path
    Path to the .hgt.zip file (e.g., 'S03E028.SRTMGL1.hgt.zip').
data_store : DataStore, optional
    Data store for reading files. If None, uses LocalDataStore().
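SRTM tile names encode the south-west corner of the tile, so 'S03E028' covers latitudes -3 to -2 and longitudes 28 to 29. The sketch below shows one way to recover that corner from a file name; `tile_origin` is a hypothetical helper, not part of the documented API.

```python
import re
from pathlib import Path

# Hemisphere letter + 2-digit latitude, hemisphere letter + 3-digit longitude.
_TILE_NAME = re.compile(r"^([NS])(\d{2})([EW])(\d{3})")


def tile_origin(hgt_zip_path):
    """Hypothetical helper: (latitude, longitude) of the tile's south-west corner."""
    match = _TILE_NAME.match(Path(hgt_zip_path).name)
    if match is None:
        raise ValueError(f"not an SRTM tile name: {hgt_zip_path}")
    ns, lat, ew, lon = match.groups()
    latitude = int(lat) if ns == "N" else -int(lat)
    longitude = int(lon) if ew == "E" else -int(lon)
    return latitude, longitude
```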
Source code in gigaspatial/handlers/srtm/srtm_parser.py
get_elevation(latitude, longitude) ¶
Get interpolated elevation for a specific coordinate.
Uses bilinear interpolation for accurate elevation values between grid points.
Parameters¶
latitude : float
    Latitude in decimal degrees.
longitude : float
    Longitude in decimal degrees.
Returns¶
float
    Interpolated elevation in meters, or np.nan if outside tile bounds.
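Bilinear interpolation blends the four grid samples surrounding a point, weighted by how close the point is to each. The sketch below illustrates the technique on a plain nested list; the function names and the coordinate-to-index mapping are assumptions for illustration, not the parser's actual code.

```python
import math


def to_grid_index(lat, lon, sw_lat, sw_lon, size):
    """Hypothetical mapping from coordinates to fractional (row, col).

    Assumes a 1x1 degree tile whose rows run north to south.
    """
    row = (sw_lat + 1 - lat) * (size - 1)
    col = (lon - sw_lon) * (size - 1)
    return row, col


def bilinear(grid, row, col):
    """Interpolate a value at fractional (row, col); NaN outside the grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    if not (0 <= row <= n_rows - 1 and 0 <= col <= n_cols - 1):
        return math.nan
    r0, c0 = int(math.floor(row)), int(math.floor(col))
    r1, c1 = min(r0 + 1, n_rows - 1), min(c0 + 1, n_cols - 1)
    fr, fc = row - r0, col - c0  # fractional offsets inside the cell
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr
```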
Source code in gigaspatial/handlers/srtm/srtm_parser.py
get_elevation_batch(coordinates) ¶
Get interpolated elevations for multiple coordinates (vectorized).
Parameters¶
coordinates : np.ndarray of shape (n, 2)
    Array of (latitude, longitude) pairs.
Returns¶
np.ndarray of shape (n,)
    Interpolated elevations in meters.
Source code in gigaspatial/handlers/srtm/srtm_parser.py
get_tile_info() ¶
Get information about the SRTM tile.
Returns¶
dict
    Dictionary containing tile metadata.
Source code in gigaspatial/handlers/srtm/srtm_parser.py
to_array() ¶
Return elevation data in square array form with coordinate arrays.
Returns¶
tuple of (elevation_array, latitudes, longitudes)
    elevation_array : np.ndarray of shape (size, size)
        2D array of elevation values in meters.
    latitudes : np.ndarray of shape (size,)
        Latitude values for each row (north to south).
    longitudes : np.ndarray of shape (size,)
        Longitude values for each column (west to east).
Source code in gigaspatial/handlers/srtm/srtm_parser.py
to_dataframe(dropna=True) ¶
Convert elevation data to a DataFrame with coordinates.
Returns¶
pd.DataFrame
    DataFrame with columns: latitude, longitude, elevation.
Source code in gigaspatial/handlers/srtm/srtm_parser.py
utils ¶
EarthdataSession ¶
Bases: Session
Custom requests.Session for NASA Earthdata authentication.
Maintains Authorization headers through redirects to/from Earthdata hosts. This is required because Earthdata uses multiple redirect domains during authentication.
Source code in gigaspatial/handlers/srtm/utils.py
rebuild_auth(prepared_request, response) ¶
Keep auth header on redirects to/from Earthdata host.
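The decision behind this override can be sketched as a pure function: keep the Authorization header when the host is unchanged or when either side of the redirect is the Earthdata Login host, and drop it otherwise. The function name and the exact rule are assumptions modelled on the commonly used Earthdata pattern; the class's real logic may differ.

```python
from urllib.parse import urlparse

AUTH_HOST = "urs.earthdata.nasa.gov"  # Earthdata Login host


def keep_auth_header(original_url, redirect_url, auth_host=AUTH_HOST):
    """Hypothetical helper: should the Authorization header survive this redirect?"""
    original = urlparse(original_url).hostname
    redirect = urlparse(redirect_url).hostname
    if original == redirect:
        return True  # same host, nothing leaks
    # Cross-host redirects are only trusted when the auth host is involved.
    return auth_host in (original, redirect)
```

Inside a `requests.Session` subclass, `rebuild_auth` would delete the header from `prepared_request.headers` whenever a check like this returns False.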
Source code in gigaspatial/handlers/srtm/utils.py
unicef_georepo ¶
GeoRepoClient ¶
A client for interacting with the GeoRepo API.
GeoRepo is a platform for managing and accessing geospatial administrative boundary data. This client provides methods to search, retrieve, and work with modules, datasets, views, and administrative entities.
Attributes:
| Name | Type | Description |
|---|---|---|
base_url | str | The base URL for the GeoRepo API |
api_key | str | The API key for authentication |
email | str | The email address associated with the API key |
headers | dict | HTTP headers used for API requests |
Source code in gigaspatial/handlers/unicef_georepo.py
__init__(api_key=None, email=None) ¶
Initialize the GeoRepo client.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
api_key | str | GeoRepo API key. If not provided, will use the GEOREPO_API_KEY environment variable from config. | None |
email | str | Email address associated with the API key. If not provided, will use the GEOREPO_USER_EMAIL environment variable from config. | None |
Raises:
| Type | Description |
|---|---|
ValueError | If api_key or email is not provided and cannot be found in environment variables. |
Source code in gigaspatial/handlers/unicef_georepo.py
check_connection() ¶
Checks if the API connection is valid by making a simple request.
Returns:
| Type | Description |
|---|---|
bool | True if the connection is valid, False otherwise. |
Source code in gigaspatial/handlers/unicef_georepo.py
find_country_by_iso3(view_uuid, iso3_code) ¶
Find a country entity using its ISO3 country code.
This method searches through all level-0 (country) entities to find one that matches the provided ISO3 code. It checks both the entity's Ucode and any external codes stored in the ext_codes field.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_uuid | str | The UUID of the view to search within. | required |
iso3_code | str | The ISO3 country code to search for (e.g., 'USA', 'KEN', 'BRA'). | required |
Returns:
| Type | Description |
|---|---|
dict or None | Entity information dictionary for the matching country if found, including properties like name, ucode, admin_level, etc. Returns None if no matching country is found. |
Note
This method handles pagination automatically to search through all available countries in the dataset, which may involve multiple API calls.
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or view_uuid is invalid. |
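The automatic pagination mentioned in the note can be sketched with a generator that keeps requesting pages until the reported page count is reached. Everything here is illustrative: `fetch_page` is a hypothetical callable standing in for an API call, and the only assumed response detail is a `total_page` field in the pagination metadata.

```python
def iter_all_pages(fetch_page, page_size=50):
    """Yield every item from a paginated listing.

    fetch_page(page, page_size) is a hypothetical callable returning
    (items, meta), where meta carries a 'total_page' count.
    """
    page = 1
    while True:
        items, meta = fetch_page(page, page_size)
        yield from items
        if page >= meta.get("total_page", page):
            break
        page += 1


def find_first(items, predicate):
    """Return the first item matching predicate, or None."""
    return next((item for item in items if predicate(item)), None)
```

Because the generator is lazy, a search stops issuing requests as soon as a match is found; only a failed search walks every page.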
Source code in gigaspatial/handlers/unicef_georepo.py
get_admin_boundaries(view_uuid, admin_level=None, geom='full_geom', format='geojson') ¶
Get administrative boundaries for a specific level or all levels.
This is a convenience method that can retrieve boundaries for a single administrative level or attempt to fetch all available levels.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_uuid | str | The UUID of the view to query. | required |
admin_level | int | Administrative level to retrieve (0=country, 1=region, etc.). If None, attempts to fetch all levels. | None |
geom | str | Geometry inclusion level. Options: "no_geom" (no geometry data), "centroid" (only centroid points), "full_geom" (complete boundary geometries). Defaults to "full_geom". | 'full_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "geojson". | 'geojson' |
Returns:
| Type | Description |
|---|---|
dict | JSON/GeoJSON response containing administrative boundaries in the specified format. For GeoJSON, returns a FeatureCollection. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_dataset_details(dataset_uuid) ¶
Get detailed information about a specific dataset.
This includes metadata about the dataset and information about available administrative levels (e.g., country, province, district).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_uuid | str | The UUID of the dataset to query. | required |
Returns:
| Type | Description |
|---|---|
dict | JSON response containing dataset details: basic metadata (name, description, etc.), available administrative levels and their properties, and temporal information and data sources. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or dataset_uuid is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_entity_by_ucode(ucode, geom='full_geom', format='geojson') ¶
Get detailed information about a specific entity using its Ucode.
A Ucode (Universal Code) is a unique identifier for geographic entities within the GeoRepo system, typically in the format "ISO3_LEVEL_NAME".
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ucode | str | The unique code identifier for the entity. | required |
geom | str | Geometry inclusion level. Options: "no_geom" (no geometry data), "centroid" (only centroid points), "full_geom" (complete boundary geometries). Defaults to "full_geom". | 'full_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "geojson". | 'geojson' |
Returns:
| Type | Description |
|---|---|
dict | JSON/GeoJSON response containing entity details including geometry, properties, administrative level, and metadata. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or ucode is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_vector_tiles_url(view_info) ¶
Generate an authenticated URL for accessing vector tiles.
Vector tiles are used for efficient map rendering and can be consumed by mapping libraries like Mapbox GL JS or OpenLayers.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_info | dict | Dictionary containing view information that must include a 'vector_tiles' key with the base vector tiles URL. | required |
Returns:
| Type | Description |
|---|---|
str | Fully authenticated vector tiles URL with API key and user email parameters appended for access control. |
Raises:
| Type | Description |
|---|---|
ValueError | If 'vector_tiles' key is not found in view_info. |
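A minimal sketch of appending credentials to a tiles URL with the standard library follows. The query parameter names `token` and `georepo_user_key` are assumptions, as is the helper name; check the GeoRepo documentation for the names the service actually expects.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_vector_tiles_url(view_info, api_key, email,
                           key_param="token", email_param="georepo_user_key"):
    """Hypothetical helper: append credentials to view_info['vector_tiles'].

    The default parameter names are assumptions, not confirmed API details.
    """
    if "vector_tiles" not in view_info:
        raise ValueError("view_info has no 'vector_tiles' key")
    parts = urlsplit(view_info["vector_tiles"])
    query = parse_qsl(parts.query)  # keep any parameters already present
    query += [(key_param, api_key), (email_param, email)]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Building the query with `urlencode` percent-encodes characters such as the `@` in an email address, while tile placeholders like `{z}/{x}/{y}` in the path are left untouched.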
Source code in gigaspatial/handlers/unicef_georepo.py
list_datasets_by_module(module_uuid) ¶
List all datasets within a specific module.
A dataset represents a collection of related geographic entities, such as administrative boundaries for a specific country or region.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
module_uuid | str | The UUID of the module to query. | required |
Returns:
| Type | Description |
|---|---|
dict | JSON response containing a list of datasets with their metadata. Each dataset includes 'uuid', 'name', 'description', creation date, etc. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or module_uuid is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_entities_by_admin_level(view_uuid, admin_level, geom='no_geom', format='json', page=1, page_size=50) ¶
List entities at a specific administrative level within a view.
Administrative levels typically follow a hierarchy:
- Level 0: Countries
- Level 1: States/Provinces/Regions
- Level 2: Districts/Counties
- Level 3: Sub-districts/Municipalities
- And so on...
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_uuid | str | The UUID of the view to query. | required |
admin_level | int | The administrative level to retrieve (0, 1, 2, etc.). | required |
geom | str | Geometry inclusion level. Options: "no_geom" (no geometry data), "centroid" (only centroid points), "full_geom" (complete boundary geometries). Defaults to "no_geom". | 'no_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "json". | 'json' |
page | int | Page number for pagination. Defaults to 1. | 1 |
page_size | int | Number of results per page. Defaults to 50. | 50 |
Returns:
| Type | Description |
|---|---|
tuple | A tuple of two dicts: the JSON/GeoJSON response with entity data, and metadata with pagination info (page, total_page, total_count). |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_entity_children(view_uuid, entity_ucode, geom='no_geom', format='json') ¶
List direct children of an entity in the administrative hierarchy.
For example, if given a country entity, this will return its states/provinces. If given a state entity, this will return its districts/counties.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_uuid | str | The UUID of the view containing the entity. | required |
entity_ucode | str | The Ucode of the parent entity. | required |
geom | str | Geometry inclusion level. Options: "no_geom" (no geometry data), "centroid" (only centroid points), "full_geom" (complete boundary geometries). Defaults to "no_geom". | 'no_geom' |
format | str | Response format ("json" or "geojson"). Defaults to "json". | 'json' |
Returns:
| Type | Description |
|---|---|
dict | JSON/GeoJSON response containing the list of child entities with their properties and optional geometry data. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_modules() ¶
List all available modules in GeoRepo.
A module is a top-level organizational unit that contains datasets. Examples include "Admin Boundaries", "Health Facilities", etc.
Returns:
| Type | Description |
|---|---|
dict | JSON response containing a list of modules with their metadata. Each module includes 'uuid', 'name', 'description', and other properties. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails. |
Source code in gigaspatial/handlers/unicef_georepo.py
list_views_by_dataset(dataset_uuid, page=1, page_size=50) ¶
List views for a dataset with pagination support.
A view represents a specific version or subset of a dataset. Views may be tagged as 'latest' or represent different time periods.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_uuid | str | The UUID of the dataset to query. | required |
page | int | Page number for pagination. Defaults to 1. | 1 |
page_size | int | Number of results per page. Defaults to 50. | 50 |
Returns:
| Type | Description |
|---|---|
dict | JSON response containing a paginated list of views with metadata. Includes 'results', 'total_page', 'current_page', and 'count' fields. Each view includes 'uuid', 'name', 'tags', and other properties. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or dataset_uuid is invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
search_entities_by_name(view_uuid, name, page=1, page_size=50) ¶
Search for entities by name using fuzzy matching.
This performs a similarity-based search to find entities whose names match or are similar to the provided search term.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
view_uuid | str | The UUID of the view to search within. | required |
name | str | The name or partial name to search for. | required |
page | int | Page number for pagination. Defaults to 1. | 1 |
page_size | int | Number of results per page. Defaults to 50. | 50 |
Returns:
| Type | Description |
|---|---|
dict | JSON response containing paginated search results with matching entities and their similarity scores. |
Raises:
| Type | Description |
|---|---|
HTTPError | If the API request fails or parameters are invalid. |
Source code in gigaspatial/handlers/unicef_georepo.py
find_admin_boundaries_module() ¶
Find and return the UUID of the Admin Boundaries module.
This is a convenience function that searches through all available modules to locate the one named "Admin Boundaries", which typically contains administrative boundary datasets.
Returns:
| Type | Description |
|---|---|
str | The UUID of the Admin Boundaries module. |
Raises:
| Type | Description |
|---|---|
ValueError | If the Admin Boundaries module is not found. |
Source code in gigaspatial/handlers/unicef_georepo.py
get_country_boundaries_by_iso3(iso3_code, client=None, admin_level=None) ¶
Get administrative boundaries for a specific country using its ISO3 code.
This function provides a high-level interface to retrieve country boundaries by automatically finding the appropriate module, dataset, and view, then fetching the requested administrative boundaries.
The function will:
1. Find the Admin Boundaries module
2. Locate a global dataset within that module
3. Find the latest view of that dataset
4. Search for the country using the ISO3 code
5. Look for a country-specific view if available
6. Retrieve boundaries at the specified admin level or all levels
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
iso3_code | str | The ISO3 country code (e.g., 'USA', 'KEN', 'BRA'). | required |
admin_level | int | The administrative level to retrieve: 0 = country, 1 = state/province/region, 2 = district/county, 3 = sub-district/municipality, and so on. If None, retrieves all available administrative levels. | None |
Returns:
| Type | Description |
|---|---|
dict | A GeoJSON FeatureCollection containing the requested boundaries. Each feature includes geometry and properties for the administrative unit. |
Raises:
| Type | Description |
|---|---|
ValueError | If the Admin Boundaries module, datasets, views, or country cannot be found. |
HTTPError | If any API requests fail. |
Note
This function may make multiple API calls and can take some time for countries with many administrative units. It handles pagination automatically and attempts to use country-specific views when available for better performance.
Example

    # Get all administrative levels for Kenya
    boundaries = get_country_boundaries_by_iso3('KEN')

    # Get only province-level boundaries for Kenya
    provinces = get_country_boundaries_by_iso3('KEN', admin_level=1)
Source code in gigaspatial/handlers/unicef_georepo.py
worldpop ¶
WPPopulationConfig dataclass ¶
Bases: BaseHandlerConfig
Source code in gigaspatial/handlers/worldpop.py
extract_search_geometry(source, **kwargs) ¶
Overrides the base method because geometry extraction does not apply to this dataset. Returns the country's ISO3 code, which is used for the dataset search.
Source code in gigaspatial/handlers/worldpop.py
get_data_unit_path(unit, **kwargs) ¶
get_data_unit_paths(units, **kwargs) ¶
Given WP file url(s), return the corresponding local file paths.
- For school_age age_structures (zip resources), if extracted .tif files are present in the target directory, return those; otherwise, return the zip path(s) to allow the downloader to fetch and extract them.
- For non-school_age age_structures (individual .tif URLs), you can filter by sex and age using kwargs: sex, ages, min_age, max_age.
Source code in gigaspatial/handlers/worldpop.py
validate_configuration() ¶
Validate that the configuration is valid based on dataset availability constraints.
Specific rules:
- For age_structures:
    - School age data is only available for 2020 at 1km resolution.
    - Non-school age data is only available at 100m resolution.
    - Unconstrained, non-school age data is only available without UN adjustment.
    - Constrained, non-school age data with UN adjustment is only available for 2020.
    - Constrained, non-school age data without UN adjustment is only available for 2020 and 2024.
- For pop:
    - 2024 data is only available at 100m resolution and without UN adjustment.
    - Constrained data (other than 2024) is only available for 2020 at 100m resolution.
    - Unconstrained data at 100m or 1km is available for other years, with or without UN adjustment.
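The rules above can be read as a decision tree. The sketch below encodes them as a standalone check that returns a list of violations. The `PopRequest` dataclass and its field names are hypothetical stand-ins for illustration; they are not the actual fields of WPPopulationConfig.

```python
from dataclasses import dataclass


@dataclass
class PopRequest:
    """Hypothetical stand-in for the relevant configuration fields."""
    project: str               # "pop" or "age_structures"
    year: int
    resolution: int            # metres: 100 or 1000
    constrained: bool = False
    un_adjusted: bool = False
    school_age: bool = False


def check_request(req):
    """Return a list of rule violations (empty when the request is valid)."""
    errors = []
    if req.project == "age_structures":
        if req.school_age:
            if req.year != 2020 or req.resolution != 1000:
                errors.append("school age data: 2020 at 1km only")
        else:
            if req.resolution != 100:
                errors.append("non-school age data: 100m only")
            if not req.constrained and req.un_adjusted:
                errors.append("unconstrained age data: no UN adjustment")
            if req.constrained and req.un_adjusted and req.year != 2020:
                errors.append("constrained, UN-adjusted age data: 2020 only")
            if req.constrained and not req.un_adjusted and req.year not in (2020, 2024):
                errors.append("constrained age data: 2020 or 2024 only")
    elif req.project == "pop":
        if req.year == 2024:
            if req.resolution != 100 or req.un_adjusted:
                errors.append("2024 data: 100m without UN adjustment only")
        elif req.constrained:
            if req.year != 2020 or req.resolution != 100:
                errors.append("constrained data: 2020 at 100m only")
        elif req.resolution not in (100, 1000):
            errors.append("unconstrained data: 100m or 1km only")
    else:
        errors.append(f"unknown project: {req.project}")
    return errors
```

Collecting all violations in a list, rather than raising on the first one, lets a caller report every problem with a configuration at once.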
Source code in gigaspatial/handlers/worldpop.py
WPPopulationDownloader ¶
Bases: BaseHandlerDownloader
Source code in gigaspatial/handlers/worldpop.py
__init__(config, data_store=None, logger=None) ¶
Initialize the downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Union[WPPopulationConfig, dict[str, Union[str, int]]] | Configuration for the WorldPop dataset, either as a WPPopulationConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/worldpop.py
download_data_unit(url, **kwargs) ¶
Download the data file for a URL. If the file is a zip archive, extract the .tif files it contains.
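The extraction step can be sketched with the standard `zipfile` module. This is an illustration under assumptions: `extract_tifs` is a hypothetical helper operating on a local path, whereas the downloader works through a data store, and flattening nested directories is a design choice made here for simplicity.

```python
import zipfile
from pathlib import Path


def extract_tifs(zip_path, target_dir):
    """Hypothetical helper: extract only .tif members, return their paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if member.lower().endswith(".tif"):
                # Drop any directory prefix stored inside the archive.
                out_path = target / Path(member).name
                out_path.write_bytes(archive.read(member))
                extracted.append(out_path)
    return extracted
```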
Source code in gigaspatial/handlers/worldpop.py
download_data_units(urls, **kwargs) ¶
Download data files for multiple URLs.
Source code in gigaspatial/handlers/worldpop.py
WPPopulationHandler ¶
Bases: BaseHandler
Handler for WorldPop population datasets.
This class provides a unified interface for downloading and loading WP Population data. It manages the lifecycle of configuration, downloading, and reading components.
Source code in gigaspatial/handlers/worldpop.py
create_config(data_store, logger, **kwargs) ¶
Create and return a WPPopulationConfig instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional configuration parameters | {} |
Returns:
| Type | Description |
|---|---|
WPPopulationConfig | Configured WPPopulationConfig instance |
Source code in gigaspatial/handlers/worldpop.py
create_downloader(config, data_store, logger, **kwargs) ¶
Create and return a WPPopulationDownloader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | WPPopulationConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional downloader parameters | {} |
Returns:
| Type | Description |
|---|---|
WPPopulationDownloader | Configured WPPopulationDownloader instance |
Source code in gigaspatial/handlers/worldpop.py
create_reader(config, data_store, logger, **kwargs) ¶
Create and return a WPPopulationReader instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | WPPopulationConfig | The configuration object | required |
data_store | DataStore | The data store instance to use | required |
logger | Logger | The logger instance to use | required |
**kwargs | | Additional reader parameters | {} |
Returns:
| Type | Description |
|---|---|
WPPopulationReader | Configured WPPopulationReader instance |
Source code in gigaspatial/handlers/worldpop.py
load_into_dataframe(source, ensure_available=True, **kwargs) ¶
Load WorldPop population data into a pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | str | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame containing the WorldPop population data |
Source code in gigaspatial/handlers/worldpop.py
load_into_geodataframe(source, ensure_available=True, **kwargs) ¶
Load WorldPop population data into a geopandas GeoDataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source | str | The data source specification | required |
ensure_available | bool | If True, ensure data is downloaded before loading | True |
**kwargs | | Additional parameters passed to load methods | {} |
Returns:
| Type | Description |
|---|---|
GeoDataFrame | GeoDataFrame containing the WorldPop population data |
Source code in gigaspatial/handlers/worldpop.py
WPPopulationReader ¶
Bases: BaseHandlerReader
Source code in gigaspatial/handlers/worldpop.py
__init__(config, data_store=None, logger=None) ¶
Initialize the reader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config | Union[WPPopulationConfig, dict[str, Union[str, int]]] | Configuration for the WorldPop dataset, either as a WPPopulationConfig object or a dictionary of parameters | required |
data_store | Optional[DataStore] | Optional data storage interface. If not provided, uses LocalDataStore. | None |
logger | Optional[Logger] | Optional custom logger. If not provided, uses default logger. | None |
Source code in gigaspatial/handlers/worldpop.py
load_from_paths(source_data_path, merge_rasters=False, **kwargs) ¶
Load TifProcessors of WP datasets.
Args:
- source_data_path: List of file paths to load.
- merge_rasters: If True, all rasters will be merged into a single TifProcessor. Defaults to False.
Returns:
- Union[List[TifProcessor], TifProcessor]: List of TifProcessor objects for accessing the raster data, or a single TifProcessor if merge_rasters is True.
Source code in gigaspatial/handlers/worldpop.py
WorldPopRestClient ¶
REST API client for WorldPop data access.
This class provides direct access to the WorldPop REST API without any configuration dependencies, allowing flexible integration patterns.
Source code in gigaspatial/handlers/worldpop.py
__enter__() ¶
__exit__(exc_type, exc_val, exc_tb) ¶
__init__(base_url='https://www.worldpop.org/rest/data', stats_url='https://api.worldpop.org/v1/services/stats', api_key=None, timeout=30, logger=None) ¶
Initialize the WorldPop REST API client.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
base_url | str | Base URL for the WorldPop REST API | 'https://www.worldpop.org/rest/data' |
stats_url | str | URL for the WorldPop statistics API | 'https://api.worldpop.org/v1/services/stats' |
api_key | Optional[str] | Optional API key for higher rate limits | None |
timeout | int | Request timeout in seconds | 30 |
logger | Optional[Logger] | Optional logger instance | None |
Source code in gigaspatial/handlers/worldpop.py
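The presence of `__enter__`, `__exit__`, and `close` suggests the client can be used in a `with` block. The sketch below illustrates that pattern with a hypothetical stand-in class, `StubClient`, so that it runs without network access. It mirrors the documented constructor defaults and assumes, without confirmation from the source, that `__exit__` calls `close()`.

```python
from typing import Optional


class StubClient:
    """Hypothetical stand-in mirroring the documented constructor defaults."""

    def __init__(
        self,
        base_url: str = "https://www.worldpop.org/rest/data",
        stats_url: str = "https://api.worldpop.org/v1/services/stats",
        api_key: Optional[str] = None,
        timeout: int = 30,
    ):
        self.base_url = base_url.rstrip("/")
        self.stats_url = stats_url
        self.api_key = api_key
        self.timeout = timeout
        self.closed = False

    def close(self) -> None:
        # A real client would release its HTTP resources here.
        self.closed = True

    def __enter__(self) -> "StubClient":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        # Assumption: leaving the block closes the client, even on error.
        self.close()


with StubClient(timeout=10) as client:
    timeout_inside = client.timeout
    closed_inside = client.closed
```

With the real class, the same shape would presumably read `with WorldPopRestClient(timeout=10) as client: ...`, so that cleanup happens even if a request raises.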
close() ¶
find_dataset(dataset_type, category, iso3, year, **filters) ¶
Find a specific dataset by year and optional filters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Dataset type alias | required |
category | str | Category alias | required |
iso3 | str | ISO3 country code | required |
year | Union[str, int] | Year to search for | required |
**filters | | Additional filters (e.g., gender='F', resolution='1km') | {} |
Returns:
| Type | Description |
|---|---|
Optional[Dict[str, Any]] | Dataset dictionary or None if not found |
Source code in gigaspatial/handlers/worldpop.py
get_available_projects() ¶
Get list of all available projects (e.g., population, births, pregnancies, etc.).
Returns:
| Type | Description |
|---|---|
List[Dict[str, Any]] | List of project dictionaries with alias, name, title, and description |
Source code in gigaspatial/handlers/worldpop.py
get_dataset_by_id(dataset_type, category, dataset_id) ¶
Get dataset information by ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Dataset type alias (e.g., 'pop', 'births') | required |
category | str | Category alias (e.g., 'wpgp', 'pic') | required |
dataset_id | str | Dataset ID | required |
Returns:
| Type | Description |
|---|---|
Optional[Dict[str, Any]] | Dataset dictionary or None if not found |
Source code in gigaspatial/handlers/worldpop.py
get_dataset_info(dataset) ¶
Extract useful information from a dataset dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset | Dict[str, Any] | Dataset dictionary from API | required |
Returns:
| Type | Description |
|---|---|
Dict[str, Any] | Cleaned dataset information |
Source code in gigaspatial/handlers/worldpop.py
get_datasets(dataset_type, category, params) ¶
Get all datasets that match the given query parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Dataset type alias (e.g., 'pop', 'births') | required |
category | str | Category alias (e.g., 'wpgp', 'pic') | required |
params | dict | Query parameters (e.g., {'iso3': 'RWA'}) | required |
Returns:
| Type | Description |
|---|---|
| List of dataset dictionaries with metadata and file information |
Source code in gigaspatial/handlers/worldpop.py
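As a rough illustration of how `dataset_type`, `category`, and `params` might combine into a request, the sketch below builds a URL from the documented default `base_url`. The helper `build_datasets_url` is hypothetical, and the mapping of the first two arguments to path segments and of `params` to a query string is an assumption rather than something the source confirms.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.worldpop.org/rest/data"  # documented default base_url


def build_datasets_url(dataset_type: str, category: str, params: dict) -> str:
    """Hypothetical helper: join path segments and append the query string."""
    url = f"{BASE_URL}/{dataset_type}/{category}"
    query = urlencode(params)
    # Only append "?" when there is something to query on.
    return f"{url}?{query}" if query else url


url = build_datasets_url("pop", "wpgp", {"iso3": "RWA"})
url_without_params = build_datasets_url("pop", "wpgp", {})
```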
get_datasets_by_country(dataset_type, category, iso3) ¶
Get all datasets available for a specific country.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Dataset type alias (e.g., 'pop', 'births') | required |
category | str | Category alias (e.g., 'wpgp', 'pic') | required |
iso3 | str | ISO3 country code (e.g., 'USA', 'BRA') | required |
Returns:
| Type | Description |
|---|---|
List[Dict[str, Any]] | List of dataset dictionaries with metadata and file information |
Source code in gigaspatial/handlers/worldpop.py
get_project_sources(dataset_type) ¶
Get available sources for a specific project type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Project type alias (e.g., 'pop', 'births', 'pregnancies') | required |
Returns:
| Type | Description |
|---|---|
List[Dict[str, Any]] | List of source dictionaries with alias and name |
Source code in gigaspatial/handlers/worldpop.py
get_source_entities(dataset_type, category) ¶
Get list of entities (countries, global, continental) available for a specific project type and source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Project type alias (e.g., 'pop', 'births') | required |
category | str | Source alias (e.g., 'wpgp', 'pic') | required |
Returns:
| Type | Description |
|---|---|
List[Dict[str, Any]] | List of entity dictionaries with id and iso3 codes (if applicable) |
Source code in gigaspatial/handlers/worldpop.py
list_years_for_country(dataset_type, category, iso3) ¶
List all available years for a specific country and dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | str | Dataset type alias | required |
category | str | Category alias | required |
iso3 | str | ISO3 country code | required |
Returns:
| Type | Description |
|---|---|
List[int] | Sorted list of available years |
Source code in gigaspatial/handlers/worldpop.py
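One way to picture the "sorted list of available years" is as a reduction over the dataset dictionaries returned for a country. The sketch below is a guess at that logic, not the library's implementation: the helper `years_from_datasets` is hypothetical, and the `popyear` field name is an assumption about the shape of the dataset dictionaries.

```python
from typing import Any, Dict, List


def years_from_datasets(datasets: List[Dict[str, Any]]) -> List[int]:
    """Hypothetical helper: collect distinct years, skipping unusable values."""
    years = set()
    for dataset in datasets:
        raw = dataset.get("popyear")  # assumed field name
        try:
            years.add(int(raw))
        except (TypeError, ValueError):
            continue  # missing or non-numeric year
    return sorted(years)


sample = [
    {"id": "a", "popyear": "2020"},
    {"id": "b", "popyear": "2015"},
    {"id": "c", "popyear": None},
    {"id": "d", "popyear": "2020"},
]
years = years_from_datasets(sample)
```

Converting to `int` before sorting keeps the order numeric and matches the documented `List[int]` return type.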
search_datasets(dataset_type=None, category=None, iso3=None, year=None, **filters) ¶
Search for datasets with flexible filtering.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_type | Optional[str] | Optional dataset type filter | None |
category | Optional[str] | Optional category filter | None |
iso3 | Optional[str] | Optional country filter | None |
year | Optional[Union[str, int]] | Optional year filter | None |
**filters | | Additional filters | {} |
Returns:
| Type | Description |
|---|---|
List[Dict[str, Any]] | List of matching datasets |
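The "flexible filtering" described above can be read as: every argument left at `None` places no constraint, and every supplied argument must match. The sketch below shows that idea on plain dictionaries. The helper `filter_datasets` is hypothetical, the `iso3` and `popyear` field names are assumptions, and the real method may filter differently (for example, on the server side).

```python
from typing import Any, Dict, List, Optional, Union


def filter_datasets(
    datasets: List[Dict[str, Any]],
    iso3: Optional[str] = None,
    year: Optional[Union[str, int]] = None,
    **filters: Any,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep datasets matching every supplied criterion."""
    criteria = dict(filters)
    if iso3 is not None:
        criteria["iso3"] = iso3
    if year is not None:
        criteria["popyear"] = year  # assumed field name

    def matches(dataset: Dict[str, Any]) -> bool:
        # Compare as strings so year=2020 matches a stored "2020".
        return all(str(dataset.get(k)) == str(v) for k, v in criteria.items())

    return [d for d in datasets if matches(d)]


sample = [
    {"id": "1", "iso3": "RWA", "popyear": "2020", "gender": "F"},
    {"id": "2", "iso3": "RWA", "popyear": "2019", "gender": "F"},
    {"id": "3", "iso3": "BRA", "popyear": "2020", "gender": "M"},
]
rwa_2020 = filter_datasets(sample, iso3="RWA", year=2020)
female = filter_datasets(sample, gender="F")
everything = filter_datasets(sample)
```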